Posted to solr-user@lucene.apache.org by Marco Aurélio <au...@gmail.com> on 2020/10/08 16:06:44 UTC

Solr endpoint on the public internet

Hi!

We're looking into the option of setting up search with Solr without an
intermediary application. This would mean our backend would index data into
Solr and we would have a public Solr endpoint on the internet that would
receive search requests directly.

Since I couldn't find an existing solution similar to ours, I would like to
know whether it's possible to secure Solr in a way that allows anyone
read-only access to collections, and how to achieve that. I ask specifically
because of this part of the documentation
<https://lucene.apache.org/solr/guide/8_5/securing-solr.html>:

*No Solr API, including the Admin UI, is designed to be exposed to
non-trusted parties. Tune your firewall so that only trusted computers and
people are allowed access. Because of this, the project will not regard
e.g., Admin UI XSS issues as security vulnerabilities. However, we still
ask you to report such issues in JIRA.*

Is there a way to restrict access to Solr collections to read-only, so that
users can send search requests directly to it, or should we always
keep our Solr instances completely private?

Thanks in advance!

Best regards,
Marco Godinho

Re: Solr endpoint on the public internet

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Could be a fun red/blue team exercise. Just watch out for those
cryptominers that get in through Solr injection (among many other
unsecured methods) and are a real pain to remove.

Regards,
   Alex.
P.S. Don't ask me how I know :-(
P.P.S. A read-only Docker container may still be a good layer of defence
on top of everything. Respawn it every hour, if needed.

On Thu, 8 Oct 2020 at 15:05, David Hastings <dh...@wshein.com> wrote:
>
> Welp. Never mind, I refer back to point #1: this is a bad idea.
>
> > On Oct 8, 2020, at 3:01 PM, Alexandre Rafalovitch <ar...@gmail.com> wrote:
> >
> > The update handlers are now implicitly defined (3 or 4 of them). So,
> > they actually need to be explicitly shadowed and overridden with a
> > no-op handler. And block the Config API to stop attackers from creating
> > new handlers.
> >
> > Regards,
> >   Alex.
> >
> >> On Thu, 8 Oct 2020 at 14:54, David Hastings <dh...@wshein.com> wrote:
> >>
> >> Well that’s why I suggested deleting the update handler :)
> >>
> >>>> On Oct 8, 2020, at 2:52 PM, Walter Underwood <wu...@wunderwood.org> wrote:
> >>>
> >>> Let me know where it is and I’ll delete all the documents in your collection.
> >>> It is easy, just one HTTP request.
> >>>
> >>> https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3
> >>>
> >>> wunder
> >>> Walter Underwood
> >>> wunder@wunderwood.org
> >>> http://observer.wunderwood.org/  (my blog)
> >>>
> >>>> On Oct 8, 2020, at 11:49 AM, Alexandre Rafalovitch <ar...@gmail.com> wrote:
> >>>>
> >>>> I think there were past discussions about people doing this, but they
> >>>> really, really knew what they were doing from a security perspective,
> >>>> not just a Solr one.
> >>>>
> >>>> You are increasing your risk factor a lot, so you need to think
> >>>> this through. What are you protecting, and what are you exposing? Are
> >>>> you trying to protect the updates? You may be able to do it with, for
> >>>> example, a read-only Docker container, embedded Solr, and/or a
> >>>> reverse proxy.
> >>>>
> >>>> Are you trying to protect some of the data from being read? Even harder.
> >>>>
> >>>> There are implicit handlers, admin handlers, 'qt' to select the
> >>>> request handler, etc. Lots of things to think about.
> >>>>
> >>>> It just may not be worth it.
> >>>>
> >>>> Regards,
> >>>> Alex.
> >>>>
> >>>>
> >>>>> On Thu, 8 Oct 2020 at 14:27, Marco Aurélio <au...@gmail.com> wrote:
> >>>>>
> >>>>> Hi!
> >>>>>
> >>>>> We're looking into the option of setting up search with Solr without an
> >>>>> intermediary application. This would mean our backend would index data into
> >>>>> Solr and we would have a public Solr endpoint on the internet that would
> >>>>> receive search requests directly.
> >>>>>
> >>>>> Since I couldn't find an existing solution similar to ours, I would like to
> >>>>> know whether it's possible to secure Solr in a way that allows anyone
> >>>>> read-only access to collections, and how to achieve that. I ask specifically
> >>>>> because of this part of the documentation
> >>>>> <https://lucene.apache.org/solr/guide/8_5/securing-solr.html>:
> >>>>>
> >>>>> *No Solr API, including the Admin UI, is designed to be exposed to
> >>>>> non-trusted parties. Tune your firewall so that only trusted computers and
> >>>>> people are allowed access. Because of this, the project will not regard
> >>>>> e.g., Admin UI XSS issues as security vulnerabilities. However, we still
> >>>>> ask you to report such issues in JIRA.*
> >>>>>
> >>>>> Is there a way to restrict access to Solr collections to read-only, so that
> >>>>> users can send search requests directly to it, or should we always
> >>>>> keep our Solr instances completely private?
> >>>>>
> >>>>> Thanks in advance!
> >>>>>
> >>>>> Best regards,
> >>>>> Marco Godinho
> >>>

Re: Solr endpoint on the public internet

Posted by David Hastings <dh...@wshein.com>.
Welp. Never mind, I refer back to point #1: this is a bad idea.

Re: Solr endpoint on the public internet

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
The update handlers are now implicitly defined (3 or 4 of them). So,
they actually need to be explicitly shadowed and overridden with a
no-op handler. And block the Config API to stop attackers from creating
new handlers.

Regards,
   Alex.
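
One way to check that the shadowing and the Config API block actually took
effect is to probe the paths from outside. The following is a minimal Python
sketch, not a complete audit: the path list is an assumption based on the
implicit handlers in recent Solr releases and should be verified against the
version in use, and the names `SENSITIVE_PATHS`, `http_status`, and
`exposed_paths` are hypothetical.

```python
from typing import Callable, List
import urllib.error
import urllib.request

# Assumed list of implicit write/config endpoints; verify for your Solr version.
SENSITIVE_PATHS = [
    "/update",
    "/update/json",
    "/update/csv",
    "/update/json/docs",
    "/config",
    "/schema",
]


def http_status(url: str) -> int:
    """Return the HTTP status of a GET to url, or 0 if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
    except OSError:
        # Covers URLError, timeouts, and refused connections.
        return 0


def exposed_paths(base_url: str, collection: str,
                  probe: Callable[[str], int] = http_status) -> List[str]:
    """List sensitive paths that do not answer with a 'blocked' status."""
    blocked = {0, 401, 403, 404, 405}
    found = []
    for path in SENSITIVE_PATHS:
        url = f"{base_url.rstrip('/')}/solr/{collection}{path}"
        if probe(url) not in blocked:
            found.append(path)
    return found
```

Passing a fake `probe` makes the check testable without a live server. An
empty result only means that these particular paths are blocked, not that
the instance is safe to expose.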

Re: Solr endpoint on the public internet

Posted by David Hastings <dh...@wshein.com>.
Well that’s why I suggested deleting the update handler :)

Re: Solr endpoint on the public internet

Posted by Walter Underwood <wu...@wunderwood.org>.
Let me know where it is and I’ll delete all the documents in your collection.
It is easy, just one HTTP request.

https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

Re: Solr endpoint on the public internet

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I think there were past discussions about people doing this, but they
really, really knew what they were doing from a security perspective,
not just a Solr one.

You are increasing your risk factor a lot, so you need to think
this through. What are you protecting, and what are you exposing? Are
you trying to protect the updates? You may be able to do it with, for
example, a read-only Docker container, embedded Solr, and/or a
reverse proxy.

Are you trying to protect some of the data from being read? Even harder.

There are implicit handlers, admin handlers, 'qt' to select the
request handler, etc. Lots of things to think about.

It just may not be worth it.

Regards,
   Alex.
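
As an illustration of the reverse-proxy idea, the check below accepts only
GET requests to a collection's /select path with a short list of
parameters, so that 'qt', 'shards', 'stream.url' and similar are rejected by
default. It is a minimal Python sketch under assumptions: the parameter
allowlist and the names `ALLOWED_PATH`, `ALLOWED_PARAMS`, and `is_allowed`
are hypothetical, and a real proxy would call something like this before
forwarding anything to Solr.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical policy: only /solr/<collection>/select is reachable.
ALLOWED_PATH = re.compile(r"/solr/[A-Za-z0-9_-]+/select")
# Hypothetical parameter allowlist; anything else rejects the request.
ALLOWED_PARAMS = {"q", "fq", "start", "rows", "sort", "fl"}


def is_allowed(method: str, target: str) -> bool:
    """Return True only for read requests that match the allowlist."""
    if method.upper() != "GET":
        return False
    parts = urlsplit(target)
    # fullmatch rejects extra path segments such as /select/../update.
    if not ALLOWED_PATH.fullmatch(parts.path):
        return False
    params = parse_qs(parts.query, keep_blank_values=True)
    return set(params) <= ALLOWED_PARAMS
```

This checks parameter names only. It does not inspect the value of 'q', so
local params such as {!...} inside the query string still reach Solr and
need separate handling.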


Re: Solr endpoint on the public internet

Posted by Dave <ha...@gmail.com>.
#1. This is a HORRIBLE IDEA.
#2. If I were going to do this, I would remove the update request handler and the entire admin UI from the Solr instance, and set up replication from a secure Solr instance on an interval. That way no one could send an update/delete command, you could still update the index, and it would still be readable. Just remove any request handler that isn’t search or replication, and put replication on a port shared only between the master and the slave.
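
A first pass at "remove any request handler that isn't search or
replication" could be listing what solrconfig.xml declares. The Python
sketch below does that with the standard library; the `KEEP` set and the
function name `handlers_to_remove` are hypothetical, and the sample
configuration in the usage note is made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical allowlist for a public, read-only replica.
KEEP = {"/select", "/replication"}


def handlers_to_remove(solrconfig_xml: str) -> List[str]:
    """Return names of explicitly declared request handlers not in KEEP."""
    root = ET.fromstring(solrconfig_xml)
    names = [h.get("name", "") for h in root.iter("requestHandler")]
    return [n for n in names if n not in KEEP]
```

For a configuration that declares /select, /update/extract, /replication,
and /analysis/field, this reports /update/extract and /analysis/field. It
only sees handlers written in the file, so implicitly defined handlers never
show up in the result and have to be dealt with separately.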

Re: Solr endpoint on the public internet

Posted by Jörn Franke <jo...@gmail.com>.
It is like opening a database to the Internet: you simply don’t do it, and I don’t recommend it.

If you want to do it despite the anti-pattern, use the latest Solr version and put a reverse proxy in front. Always use authentication and authorization. Allow only a minimal set of API endpoints and no admin UI. Limit the IPs that can access it. Do not use it for confidential data.
If data (even public data!) gets leaked from your Solr instance, it is very bad for the reputation of your organisation.

Future versions will allow you to disable modules that are problematic for security. Better to wait for them. Still, I would not do it in the first place, just as you would not open a database to the Internet. I also could not find a use case that requires this.
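
The "limit IPs" step is normally done in the firewall or the reverse proxy
itself. Purely to illustrate the logic, here is a minimal Python sketch
using the standard `ipaddress` module; `make_ip_filter` is a hypothetical
helper, and the address ranges in the usage note are documentation-only
example networks.

```python
import ipaddress
from typing import Callable, Iterable


def make_ip_filter(allowed_cidrs: Iterable[str]) -> Callable[[str], bool]:
    """Build a predicate that accepts only clients inside the given networks."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]

    def allowed(client_ip: str) -> bool:
        try:
            addr = ipaddress.ip_address(client_ip)
        except ValueError:
            # Anything that is not a valid IP address is rejected.
            return False
        return any(addr in net for net in networks)

    return allowed
```

For example, `make_ip_filter(["203.0.113.0/24", "2001:db8::/32"])` returns
a function that accepts "203.0.113.7" and rejects "198.51.100.1". Behind a
proxy, the client address has to come from a trusted source and not from a
header the client can set.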
