Posted to solr-user@lucene.apache.org by michael8 <mi...@saracatech.com> on 2009/11/09 20:05:38 UTC

Re: sanizing/filtering query string for security

Hi Julian,

Saw your post on exactly the question I have.  I'm curious whether you got any
response directly, or have since figured out a way to do this that you could
share.  I'm in the same situation, trying to 'sanitize' the query string
coming in before handing it to Solr.  I do see that characters like ":"
could break the query, but I'm curious whether anyone has come up with a
general solution, as I think this must be a fairly common problem for any
Solr deployment to tackle.

Thanks,
Michael


Julian Davchev wrote:
> 
> Hi,
> Is there anything special that can be done to sanitize user input
> before it is passed as a query to Solr?
> Not allowing * and ? as the first character is the only thing I can think
> of right now. Is there anything else it should handle?
> 
> I am not able to find any relevant documentation.
> 
> 
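For what it's worth, here is a minimal sketch of the kind of sanitizing being asked about, assuming the goal is to treat everything the user types as literal text. The function name sanitize_user_query is made up for illustration, and the character set is the one I understand the Lucene query syntax to reserve; check it against the query parser version you actually run.

```python
import re

# Characters that have special meaning in the Lucene query syntax.
# Assumption: this set matches your Lucene/Solr version; verify before relying on it.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def sanitize_user_query(raw: str) -> str:
    """Hypothetical helper: make raw user input safe to embed as literal terms."""
    text = raw.strip()
    # Drop leading wildcards, since * and ? as the first character are the
    # case called out in the original question.
    text = text.lstrip("*?").strip()
    # Backslash-escape every remaining special character.
    return _SPECIAL.sub(r"\\\1", text)


print(sanitize_user_query("title:foo"))
print(sanitize_user_query("*foo"))
```

Note that this does nothing about the AND / OR / NOT keywords, and it deliberately disables all query syntax, so it only fits a UI where users are not supposed to write fielded or wildcard queries.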

-- 
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26271891.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sanizing/filtering query string for security

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Yes, DisMax does handle the match-all *:* query.

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: michael8 <mi...@saracatech.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, November 9, 2009 4:59:33 PM
> Subject: Re: sanizing/filtering query string for security
> 
> 
> Sounds like a nice approach you have  done.  BTW, I have not used DisMax
> handler yet, but does it handle *:* properly?  IOW, do you care if users
> issue this query, or does DisMax treat this query string differently than
> standard request handler?  Basically given my UI, I'm trying to *hide* the
> total count from users searching for *everything*, though this syntax has
> helped me debug/monitor the state of my search doc pool size.
> 
> Thanks,
> Michael


Re: sanizing/filtering query string for security

Posted by michael8 <mi...@saracatech.com>.
Thanks guys for your input and suggestions!

Michael


Otis Gospodnetic wrote:
> 
> Word of warning:
> Careful with q.alt=*:* if you are dealing with large indices! :)
> 
> Otis

-- 
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26283657.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sanizing/filtering query string for security

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Word of warning:
Careful with q.alt=*:* if you are dealing with large indices! :)

Otis



----- Original Message ----
> From: Alexey Serba <as...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, November 9, 2009 5:23:52 PM
> Subject: Re: sanizing/filtering query string for security
> 
> > BTW, I have not used DisMax handler yet, but does it handle *:* properly?
> See q.alt DisMax parameter
> http://wiki.apache.org/solr/DisMaxRequestHandler#q.alt
> 
> You can specify q.alt=*:* and q as empty string to get all results.
> 
> > do you care if users issue this query
> I allow users to issue an empty search and get all results with all
> facets / etc. It's a nice navigation UI btw.
> 
> > Basically given my UI, I'm trying to *hide* the total count from users searching for *everything*
> If you don't specify q.alt parameter then Solr returns zero results
> for empty search. *:* won't work either.
> 
> > though this syntax has helped me debug/monitor the state of my search doc pool size.
> see q.alt
> 
> Alex


Re: sanizing/filtering query string for security

Posted by Alexey Serba <as...@gmail.com>.
> BTW, I have not used DisMax handler yet, but does it handle *:* properly?
See the q.alt DisMax parameter:
http://wiki.apache.org/solr/DisMaxRequestHandler#q.alt

You can specify q.alt=*:* and pass q as an empty string to get all results.
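A small sketch of what such a request could look like. The base URL is a placeholder, and selecting DisMax with defType=dismax is an assumption about the setup; a deployment that maps DisMax to its own request handler would select it differently.

```python
from urllib.parse import urlencode


def match_all_url(base_url: str, rows: int = 10) -> str:
    """Hypothetical helper: build a DisMax request that returns all documents."""
    params = {
        "defType": "dismax",  # assumption: DisMax is selected via defType
        "q": "",              # empty user query
        "q.alt": "*:*",       # fallback query used when q is empty
        "rows": rows,
    }
    return f"{base_url}?{urlencode(params)}"


# The host and path below are placeholders, not taken from this thread.
print(match_all_url("http://localhost:8983/solr/select", rows=0))
```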

> do you care if users issue this query
I allow users to issue an empty search and get all results with all
facets, etc. It's a nice navigation UI, btw.

> Basically given my UI, I'm trying to *hide* the total count from users searching for *everything*
If you don't specify the q.alt parameter, Solr returns zero results
for an empty search. *:* won't work either.

> though this syntax has helped me debug/monitor the state of my search doc pool size.
See q.alt.

Alex

On Tue, Nov 10, 2009 at 12:59 AM, michael8 <mi...@saracatech.com> wrote:
>
> Sounds like a nice approach you have  done.  BTW, I have not used DisMax
> handler yet, but does it handle *:* properly?  IOW, do you care if users
> issue this query, or does DisMax treat this query string differently than
> standard request handler?  Basically given my UI, I'm trying to *hide* the
> total count from users searching for *everything*, though this syntax has
> helped me debug/monitor the state of my search doc pool size.
>
> Thanks,
> Michael

Re: sanizing/filtering query string for security

Posted by michael8 <mi...@saracatech.com>.
Sounds like a nice approach.  BTW, I have not used the DisMax handler yet,
but does it handle *:* properly?  IOW, do you care if users issue this
query, or does DisMax treat this query string differently than the standard
request handler does?  Basically, given my UI, I'm trying to *hide* the
total count from users searching for *everything*, though this syntax has
helped me debug/monitor the size of my search doc pool.
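To make the "hide the total count" idea concrete, here is a rough sketch of doing it in the application layer rather than in Solr. It assumes the response has already been parsed from Solr's JSON output into a dict with a "response" object carrying "numFound"; both function names are made up for illustration.

```python
def is_match_all(user_query: str) -> bool:
    """Treat an empty query or an explicit match-all as 'searching for everything'."""
    return user_query.strip() in ("", "*", "*:*")


def hide_total_for_match_all(user_query: str, solr_response: dict) -> dict:
    """Hypothetical helper: return a copy of the response without numFound
    when the user searched for everything; otherwise return it unchanged."""
    if not is_match_all(user_query):
        return solr_response
    redacted = dict(solr_response)
    body = dict(redacted.get("response", {}))
    body.pop("numFound", None)  # drop only the total count, keep the docs
    redacted["response"] = body
    return redacted


sample = {"response": {"numFound": 12345, "start": 0, "docs": []}}
print(hide_total_for_match_all("*:*", sample))
```

An admin-only code path can still call Solr directly with the match-all query for monitoring.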

Thanks,
Michael


Alexey-34 wrote:
> 
> I added some kind of pre and post processing of Solr results for this,
> i.e.
> 
> If I find fieldname specified in query string in form of
> "fieldname:term" then I pass this query string to standard request
> handler, otherwise use DisMaxRequestHandler ( DisMaxRequestHandler
> doesn't break the query, at least I haven't seen yet ). If standard
> request handler throws error ( invalid field, too many clauses, etc )
> then I pass original query to DisMax request handler.
> 
> Alex

-- 
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26274459.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sanizing/filtering query string for security

Posted by Alexey Serba <as...@gmail.com>.
I added some pre- and post-processing around Solr for this, i.e.:

If I find a field name specified in the query string in the form
"fieldname:term", I pass the query string to the standard request
handler; otherwise I use DisMaxRequestHandler (DisMaxRequestHandler
doesn't break on the query, at least I haven't seen it do so yet). If the
standard request handler throws an error (invalid field, too many clauses,
etc.), I pass the original query to the DisMax request handler.
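Roughly, the routing described above could be sketched like this. This is an illustration only, not the actual code: run_query stands in for whatever client call sends a query to a named handler, and QueryRejected is a made-up exception that the client is assumed to raise when Solr returns an error.

```python
import re
from typing import Callable


class QueryRejected(Exception):
    """Hypothetical error raised by the client when Solr rejects a query."""


# Something that looks like fieldname:term. Note this also matches URLs
# such as http://..., which then simply fall back to DisMax on error.
FIELDED = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*:\S")


def search(user_query: str, run_query: Callable[[str, str], dict]) -> dict:
    """Try the standard handler for fielded queries, fall back to DisMax."""
    if FIELDED.search(user_query):
        try:
            return run_query("standard", user_query)
        except QueryRejected:
            pass  # invalid field, too many clauses, etc.
    return run_query("dismax", user_query)


def fake_client(handler: str, query: str) -> dict:
    # Stand-in for a real Solr call: the standard handler always fails here.
    if handler == "standard":
        raise QueryRejected("undefined field")
    return {"handler": handler, "q": query}


print(search("nosuchfield:foo", fake_client))
```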

Alex

On Mon, Nov 9, 2009 at 10:05 PM, michael8 <mi...@saracatech.com> wrote:
>
> Hi Julian,
>
> Saw your post on exactly the question I have.  I'm curious if you got any
> response directly, or figured out a way to do this by now that you could
> share?  I'm in the same situation trying to 'sanitize' the query string
> coming in before handing it to solr.  I do see that characters like ":"
> could break the query, but am curious if anyone has come up with a general
> solution as I think this must be a fairly common problem for any solr
> deployment to tackle.
>
> Thanks,
> Michael
>