Posted to solr-user@lucene.apache.org by Julian Davchev <jm...@drun.net> on 2009/01/17 15:13:48 UTC
sanizing/filtering query string for security
Hi,
Is there anything special that can be done to sanitize user input
before it is passed as a query to Solr?
Not allowing * and ? as the first character is the only thing I can think of
right now. Is there anything else it should handle?
I am not able to find any relevant documentation.
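A minimal sketch of the leading-wildcard rule mentioned above, in Python. The helper name strip_leading_wildcards is hypothetical, and stripping (rather than rejecting) such terms is an assumption about the desired behaviour, not something Solr prescribes.

```python
def strip_leading_wildcards(user_query: str) -> str:
    """Remove leading * and ? from every whitespace-separated term.

    Hypothetical helper: it only addresses leading wildcards and does
    not touch the rest of the query syntax.
    """
    cleaned_terms = []
    for term in user_query.split():
        stripped = term.lstrip("*?")
        if stripped:  # drop terms that consisted only of wildcards
            cleaned_terms.append(stripped)
    return " ".join(cleaned_terms)
```

Note that this turns *:* into :*, so a match-all query would need separate handling if it should stay available.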
Re: sanizing/filtering query string for security
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Yes, DisMax does handle the match-all *:* query.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
----- Original Message ----
> From: michael8 <mi...@saracatech.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, November 9, 2009 4:59:33 PM
> Subject: Re: sanizing/filtering query string for security
>
>
> Sounds like a nice approach you have done. BTW, I have not used DisMax
> handler yet, but does it handle *:* properly? IOW, do you care if users
> issue this query, or does DisMax treat this query string differently than
> standard request handler? Basically given my UI, I'm trying to *hide* the
> total count from users searching for *everything*, though this syntax has
> helped me debug/monitor the state of my search doc pool size.
>
> Thanks,
> Michael
Re: sanizing/filtering query string for security
Posted by michael8 <mi...@saracatech.com>.
Thanks guys for your input and suggestions!
Michael
--
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26283657.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: sanizing/filtering query string for security
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Word of warning:
Careful with q.alt=*:* if you are dealing with large indices! :)
Otis
----- Original Message ----
> From: Alexey Serba <as...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, November 9, 2009 5:23:52 PM
> Subject: Re: sanizing/filtering query string for security
>
> > BTW, I have not used DisMax handler yet, but does it handle *:* properly?
> See q.alt DisMax parameter
> http://wiki.apache.org/solr/DisMaxRequestHandler#q.alt
>
> You can specify q.alt=*:* and q as empty string to get all results.
>
> > do you care if users issue this query
> I allow users to issue an empty search and get all results with all
> facets / etc. It's a nice navigation UI btw.
>
> > Basically given my UI, I'm trying to *hide* the total count from users
> > searching for *everything*
> If you don't specify q.alt parameter then Solr returns zero results
> for empty search. *:* won't work either.
>
> > though this syntax has helped me debug/monitor the state of my search doc pool
> > size.
> see q.alt
>
> Alex
Re: sanizing/filtering query string for security
Posted by Alexey Serba <as...@gmail.com>.
> BTW, I have not used DisMax handler yet, but does it handle *:* properly?
See q.alt DisMax parameter
http://wiki.apache.org/solr/DisMaxRequestHandler#q.alt
You can specify q.alt=*:* and q as empty string to get all results.
> do you care if users issue this query
I allow users to issue an empty search and get all results with all
facets / etc. It's a nice navigation UI btw.
> Basically given my UI, I'm trying to *hide* the total count from users searching for *everything*
If you don't specify the q.alt parameter, Solr returns zero results
for an empty search. *:* as the q value won't work either.
> though this syntax has helped me debug/monitor the state of my search doc pool size.
see q.alt
Alex
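A minimal sketch of the q.alt setup described above, in Python. The function name build_dismax_params is hypothetical, and selecting the parser with defType=dismax is an assumption about the Solr setup; older configurations pick a request handler with qt instead.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query: str, allow_match_all: bool = True) -> str:
    """Build the URL query string for a DisMax search request.

    q carries the user's text. q.alt is the fallback query which, per the
    reply above, is used when q is empty.
    """
    # defType=dismax is an assumption; adjust to however DisMax is selected.
    params = {"defType": "dismax", "q": user_query.strip()}
    if allow_match_all:
        # An empty q then returns all documents instead of zero results.
        params["q.alt"] = "*:*"
    return urlencode(params)
```

Leaving allow_match_all off keeps an empty search at zero results, which is one way to hide the total document count from users.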
On Tue, Nov 10, 2009 at 12:59 AM, michael8 <mi...@saracatech.com> wrote:
>
> Sounds like a nice approach you have done. BTW, I have not used DisMax
> handler yet, but does it handle *:* properly? IOW, do you care if users
> issue this query, or does DisMax treat this query string differently than
> standard request handler? Basically given my UI, I'm trying to *hide* the
> total count from users searching for *everything*, though this syntax has
> helped me debug/monitor the state of my search doc pool size.
>
> Thanks,
> Michael
Re: sanizing/filtering query string for security
Posted by michael8 <mi...@saracatech.com>.
Sounds like a nice approach. BTW, I have not used the DisMax
handler yet, but does it handle *:* properly? IOW, do you care if users
issue this query, or does DisMax treat this query string differently than
the standard request handler? Basically, given my UI, I'm trying to *hide* the
total count from users searching for *everything*, though this syntax has
helped me debug/monitor the size of my search doc pool.
Thanks,
Michael
--
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26274459.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: sanizing/filtering query string for security
Posted by Alexey Serba <as...@gmail.com>.
I added some pre- and post-processing of Solr results for this, i.e.:
If I find a field name specified in the query string in the form
"fieldname:term", I pass the query string to the standard request
handler; otherwise I use DisMaxRequestHandler (DisMaxRequestHandler
doesn't break on the query, at least I haven't seen it do so yet). If the
standard request handler throws an error (invalid field, too many clauses,
etc.), I pass the original query to the DisMax request handler.
Alex
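The routing described above could be sketched roughly like this in Python. This is not the actual code from the post: run_standard and run_dismax are hypothetical callables that the caller supplies to query each handler, and the regular expression for spotting "fieldname:term" is an assumption.

```python
import re
from typing import Callable

# Assumed pattern for a term that looks like "fieldname:term".
_FIELD_QUERY = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*:\S")


def search_with_fallback(
    user_query: str,
    run_standard: Callable[[str], dict],
    run_dismax: Callable[[str], dict],
) -> dict:
    """Send field queries to the standard handler, everything else to DisMax."""
    if _FIELD_QUERY.search(user_query):
        try:
            return run_standard(user_query)
        except Exception:
            # e.g. invalid field or too many clauses: retry with DisMax
            return run_dismax(user_query)
    return run_dismax(user_query)
```

Catching every exception is deliberately broad here; a real client would probably only catch the error type its Solr library raises for bad queries.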
On Mon, Nov 9, 2009 at 10:05 PM, michael8 <mi...@saracatech.com> wrote:
>
> Hi Julian,
>
> Saw your post on exactly the question I have. I'm curious if you got any
> response directly, or figured out a way to do this by now that you could
> share? I'm in the same situation trying to 'sanitize' the query string
> coming in before handing it to solr. I do see that characters like ":"
> could break the query, but am curious if anyone has come up with a general
> solution as I think this must be a fairly common problem for any solr
> deployment to tackle.
>
> Thanks,
> Michael
Re: sanizing/filtering query string for security
Posted by michael8 <mi...@saracatech.com>.
Hi Julian,
Saw your post on exactly the question I have. I'm curious if you got any
response directly, or figured out a way to do this by now that you could
share? I'm in the same situation, trying to 'sanitize' the query string
coming in before handing it to Solr. I do see that characters like ":"
could break the query, but am curious if anyone has come up with a general
solution, as I think this must be a fairly common problem for any Solr
deployment to tackle.
Thanks,
Michael
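One common way to keep characters like ":" from breaking the query is to backslash-escape the query-syntax characters before sending the text to Solr. Below is a minimal sketch in Python; the helper name escape_query_text is hypothetical, and the character list is an assumption based on the Lucene query syntax documentation. For Java clients, SolrJ ships a similar helper, ClientUtils.escapeQueryChars.

```python
import re

# Characters with special meaning in the Lucene/Solr standard query syntax.
# "/" is included because newer Lucene versions treat it as a regex delimiter.
_SPECIAL_CHARS = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_query_text(user_text: str) -> str:
    """Backslash-escape query-syntax characters so the text matches literally."""
    return _SPECIAL_CHARS.sub(r"\\\1", user_text)
```

For example, title:foo becomes title\:foo. The sketch does not handle the keywords AND, OR and NOT, and escaping everything also disables wildcards and field queries, so it only suits input that should be treated as plain text.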
--
View this message in context: http://old.nabble.com/sanizing-filtering-query-string-for-security-tp21516844p26271891.html
Sent from the Solr - User mailing list archive at Nabble.com.