Posted to solr-dev@lucene.apache.org by "Ryan McKinley (JIRA)" <ji...@apache.org> on 2007/03/31 02:11:25 UTC

[jira] Created: (SOLR-202) parseQueryString should use UTF-8

parseQueryString should use UTF-8
---------------------------------

                 Key: SOLR-202
                 URL: https://issues.apache.org/jira/browse/SOLR-202
             Project: Solr
          Issue Type: Bug
            Reporter: Ryan McKinley
            Priority: Minor
             Fix For: 1.2


The update handler should explicitly use UTF-8 decoding for parameters in the query string:

   URLDecoder.decode( kv, "UTF-8" );
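
For illustration, here is a minimal sketch of how a query string could be split and decoded with an explicit charset. The class name QueryStringSketch and the method parse are hypothetical and are not taken from the Solr source or from the attached patch; only URLDecoder.decode( String, String ) comes from the issue text.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Solr's actual parseQueryString.
final class QueryStringSketch {

    // Splits "a=1&b=2" on '&' and '=', then decodes each name and value as UTF-8.
    static Map<String, List<String>> parse(String queryString) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return params;
        }
        for (String kv : queryString.split("&")) {
            if (kv.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int idx = kv.indexOf('=');
            String rawName = idx < 0 ? kv : kv.substring(0, idx);
            String rawValue = idx < 0 ? "" : kv.substring(idx + 1);
            // Decode after splitting so an encoded '&' (%26) or '=' (%3D) stays data.
            params.computeIfAbsent(decode(rawName), k -> new ArrayList<>())
                  .add(decode(rawValue));
        }
        return params;
    }

    // Names the charset explicitly instead of relying on the platform default.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("q=%C3%BC&fq=a%26b&q=this+is+simple"));
    }
}
```

Decoding each name and value separately, rather than decoding the whole string first, is a design choice in this sketch: it keeps percent-encoded separators from being mistaken for real ones.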

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-202) parseQueryString should use UTF-8

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-202.
-------------------------------

    Resolution: Fixed

looks good, I just committed.

> parseQueryString should use UTF-8
> ---------------------------------
>
>                 Key: SOLR-202
>                 URL: https://issues.apache.org/jira/browse/SOLR-202
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Ryan McKinley
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: SOLR-202-ParseQueryStringUTF8.patch, SOLR-202-ParseQueryStringUTF8.patch
>
>
> update handler should explicitly use UTF-8 decoding  for parameters in the query string:
>    URLDecoder.decode( kv, "UTF-8" );



[jira] Commented: (SOLR-202) parseQueryString should use UTF-8

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485736 ] 

Yonik Seeley commented on SOLR-202:
-----------------------------------

Didn't have time to look into why, but "ant clean test" fails after applying this patch.

   [junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 4.656 sec
   [junit] Test org.apache.solr.servlet.SolrRequestParserTest FAILED





[jira] Updated: (SOLR-202) parseQueryString should use UTF-8

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-202:
-------------------------------

    Attachment: SOLR-202-ParseQueryStringUTF8.patch

simple patch to set "UTF-8" for parameters in the query string.  
adds a test to make sure 




[jira] Updated: (SOLR-202) parseQueryString should use UTF-8

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-202:
-------------------------------

    Attachment: SOLR-202-ParseQueryStringUTF8.patch

This one avoids putting non-ASCII characters directly in the .java source file.

It tests:

  { "this is simple", "this%20is%20simple" },
  { "this is simple", "this+is+simple" },
  { "\u00FC", "%C3%BC" },   // lower-case "u" with diaeresis/umlaut
  { "\u0026", "%26" },      // &
  { "\u20AC", "%E2%82%AC" } // euro
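
A table-driven check over the pairs listed above might look like the following sketch. The class name DecodeCasesSketch and the method countFailures are made up for illustration; the actual test in the patch (SolrRequestParserTest) may be structured differently.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical stand-alone check; not the test from the attached patch.
final class DecodeCasesSketch {

    // { expected decoded text, URL-encoded input }, copied from the list above.
    static final String[][] CASES = {
        { "this is simple", "this%20is%20simple" },
        { "this is simple", "this+is+simple" },
        { "\u00FC", "%C3%BC" },     // lower-case "u" with diaeresis/umlaut
        { "\u0026", "%26" },        // ampersand
        { "\u20AC", "%E2%82%AC" }   // euro sign
    };

    // Decodes every input as UTF-8 and counts the mismatches.
    static int countFailures() {
        int failures = 0;
        for (String[] c : CASES) {
            String decoded;
            try {
                decoded = URLDecoder.decode(c[1], "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always available
            }
            if (!c[0].equals(decoded)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + countFailures()); // prints "failures: 0"
    }
}
```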






[jira] Commented: (SOLR-202) parseQueryString should use UTF-8

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485757 ] 

Yonik Seeley commented on SOLR-202:
-----------------------------------

> Any pointers on the best way to put non-standard characters into the code without causing encoding problems? 

Let Java do it... replace chars above 127 with unicode escapes (\uxxxx)
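
As a small sketch of that suggestion: a source file that uses only ASCII plus unicode escapes yields the same string constants whatever encoding the compiler assumes for the file. The class name UnicodeEscapeSketch and the helper percentEncodeUtf8 are hypothetical, and the helper percent-encodes every byte (ASCII included) purely for demonstration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo; the source text of this class is pure ASCII.
final class UnicodeEscapeSketch {

    // Returns the UTF-8 bytes of s as upper-case hex, each byte prefixed with '%'.
    static String percentEncodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escapes denote code points U+00FC and U+20AC.
        System.out.println(percentEncodeUtf8("\u00FC")); // prints %C3%BC
        System.out.println(percentEncodeUtf8("\u20AC")); // prints %E2%82%AC
    }
}
```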




[jira] Commented: (SOLR-202) parseQueryString should use UTF-8

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485754 ] 

Ryan McKinley commented on SOLR-202:
------------------------------------

Interesting, the test runs fine when I run it from Eclipse, but fails from the command line.  I think this is because I am testing:
  { "é¢©ÿ", "%C3%A9%C2%A2%C2%A9%C3%BF" }
directly in the Java code.  

Any pointers on the best way to put non-standard characters into the code without causing encoding problems?  

We could decode "%C3" then re-encode it...  a bit of a straw-man
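
To illustrate the suspected cause, the sketch below simulates a string literal whose bytes on disk are read back with two different charsets. It assumes (this is a guess, not something confirmed in this thread) that the IDE and the command-line compiler disagreed about the source file encoding. The class name SourceEncodingSketch and the method readLiteral are hypothetical; the four code points are the ones that %C3%A9%C2%A2%C2%A9%C3%BF decodes to under UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of an encoding mismatch between writer and reader.
final class SourceEncodingSketch {

    // Simulates a compiler interpreting a literal's bytes with an assumed charset.
    static String readLiteral(byte[] bytesOnDisk, Charset assumed) {
        return new String(bytesOnDisk, assumed);
    }

    public static void main(String[] args) {
        // U+00E9, U+00A2, U+00A9, U+00FF written as escapes to keep this file ASCII.
        String intended = "\u00E9\u00A2\u00A9\u00FF";
        byte[] onDisk = intended.getBytes(StandardCharsets.UTF_8); // file saved as UTF-8

        String asUtf8 = readLiteral(onDisk, StandardCharsets.UTF_8);
        String asLatin1 = readLiteral(onDisk, StandardCharsets.ISO_8859_1);

        System.out.println(intended.equals(asUtf8));   // prints true
        System.out.println(intended.equals(asLatin1)); // prints false
        System.out.println(asLatin1.length());         // prints 8 (one char per byte)
    }
}
```

With the wrong assumed charset the "expected" constant itself changes, so a comparison against the correctly decoded parameter would fail even though the decoder is behaving properly.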

