Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2011/04/05 00:27:05 UTC

[jira] [Commented] (SOLR-2456) Filter queries of values with + sign not decoded correctly

    [ https://issues.apache.org/jira/browse/SOLR-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015663#comment-13015663 ] 

Yonik Seeley commented on SOLR-2456:
------------------------------------

There are a lot of different things going on here.  First, let's focus on lucene query syntax and forget about HTTP URL encoding (that's just transfer syntax).

A lucene query of
required_experience:1 to 2 Years
is really equivalent to
required_experience:1 default_field:to default_field:2 default_field:Years
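To make that equivalence concrete, here is a toy sketch in Python.  It is not the Lucene query parser (it ignores quotes, operators, escaping and analysis); the function name toy_clauses and the literal "default_field" placeholder are just for illustration.  It only shows that whitespace ends the fielded clause, so every later bare word goes to the default field.

```python
def toy_clauses(query, default_field="default_field"):
    """Toy model only: split on whitespace and give every bare term
    the default field. The real Lucene parser does far more than this."""
    clauses = []
    for token in query.split():
        if ":" in token:
            # Already has an explicit field, e.g. required_experience:1
            clauses.append(token)
        else:
            # Bare word: searched against the default field
            clauses.append(f"{default_field}:{token}")
    return " ".join(clauses)


rewritten = toy_clauses("required_experience:1 to 2 Years")
print(rewritten)
```

That is why the unquoted form cannot match the single indexed string value "1 to 2 Years".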

Next, URL encoding is normally an implementation detail.  If Solr started supporting some other transport such as Thrift, there would be no %2b at all.  When the servlet container sees a %2b, it translates it into a "+" before Solr gets it.
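As a rough illustration of that decoding step, Python's standard urllib.parse can stand in for what a servlet container does with a query string (an approximation for demonstration, not Solr or container code).  Note that a literal "+" in a query string decodes to a space, while %2B decodes to "+".

```python
from urllib.parse import parse_qs

# %2B becomes "+" and %20 becomes a space before the query parser sees it.
decoded = parse_qs("fq=required_experience:10%2B%20Years")["fq"][0]

# A raw "+" in a query string is treated as a space, not as a plus sign.
plus_as_space = parse_qs("fq=required_experience:1+to+2+Years")["fq"][0]

print(decoded)
print(plus_as_space)
```

So by the time the lucene query parser runs, the percent escapes are already gone and only lucene syntax rules apply.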

There are certain query parsers (qparsers) specifically designed to help out at the lucene syntax level, so you don't have to deal with escaping special query parser chars, double quotes, etc.
http://wiki.apache.org/solr/SolrQuerySyntax

Since you're on 4.0-dev, I'd recommend using the "term" qparser for this:

fq={!term f=required_experience}10+ Years

The benefit is that at the lucene syntax level, there is no escaping whatsoever needed when appending the value you are filtering on.

Now, for the HTTP layer, clients normally take care of the required escaping.  But if you're using something low-level like curl that does not do it for you, then it would look like:

fq={!term%20f=required_experience}10%2b%20Years
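For a client that builds the URL itself, the two layers can be kept separate along these lines.  This is a minimal Python sketch; the helper name term_filter is made up for illustration, and only the standard urllib.parse module is used.  The value is appended verbatim at the lucene level, and URL encoding is applied once, at the end, to the whole parameter.

```python
from urllib.parse import parse_qs, quote, urlencode


def term_filter(field, value):
    """Build a {!term} filter string; the value needs no lucene escaping."""
    return "{!term f=%s}%s" % (field, value)


fq = term_filter("required_experience", "10+ Years")

# Form-style encoding of the whole parameter (space -> "+", "+" -> %2B).
query_string = urlencode({"fq": fq})

# Lighter encoding that leaves {, }, ! and = alone, like the curl example.
minimal = "fq=" + quote(fq, safe="{}!=")

# Decoding either form gives back the original filter string.
round_trip = parse_qs(query_string)["fq"][0]

print(query_string)
print(minimal)
```

Python emits the escape as %2B rather than %2b; percent escapes are case-insensitive, so both decode to the same "+".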


> Filter queries of values with + sign not decoded correctly
> ----------------------------------------------------------
>
>                 Key: SOLR-2456
>                 URL: https://issues.apache.org/jira/browse/SOLR-2456
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Scott Kister
>            Priority: Minor
>
> Querying by filters with values containing a + sign does not work as expected. Querying by the quoted value fails. Escaping the + and space without quotes also fails. I did finally get a query to work, but it involved both quoting the value and escaping the +, but not the space.
> Either quoting the value, or escaping should work.
> To reproduce, create a test collection with two documents.
>   "response":{"numFound":2,"start":0,"docs":[{
>         "listing_id":2483808693,
>         "required_experience":["10+ Years"]},{
>         "listing_id":2484835296,
>         "required_experience":["1 to 2 Years"]}]
> These all return 0 results, I believe the first 4 should work.
> ?fq=required_experience:1+to+2+Years
> ?fq=required_experience:1%20to%202%20Years
> ?fq=required_experience:10%2B%20Years
> ?fq=required_experience:"10+ Years"
> ?fq=required_experience:10\+\ Years
> These do work, the second one should not work since %2B is quoted and should not then be urldecoded.
> ?fq=required_experience:"1 to 2 Years"
> ?fq=required_experience:"10%2B Years"
> I tested with the most recent build, apache-solr-4.0-2011-04-01_08-37-23.tgz
> schema.xml for required_experience is
>     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>    <field name="required_experience" type="string" indexed="true" />
