Posted to solr-commits@lucene.apache.org by ry...@apache.org on 2007/05/08 06:23:09 UTC

svn commit: r536048 - /lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java

Author: ryan
Date: Mon May  7 21:23:07 2007
New Revision: 536048

URL: http://svn.apache.org/viewvc?view=rev&rev=536048
Log:
SOLR-214 -- override getReader() explicitly.  subclass called instance variables, not getContentType()

Modified:
    lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java

Modified: lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java
URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java?view=diff&rev=536048&r1=536047&r2=536048
==============================================================================
--- lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java (original)
+++ lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java Mon May  7 21:23:07 2007
@@ -20,6 +20,8 @@
 import java.io.File;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.io.Reader;
 import java.io.UnsupportedEncodingException;
 import java.net.URL;
 import java.net.URLDecoder;
@@ -231,20 +233,16 @@
     // Rather than return req.getReader(), this uses the default ContentStreamBase method
     // that checks for charset definitions in the ContentType.
     
-    streams.add( new ContentStreamBase() {
-      @Override
+    streams.add( new ContentStream() {
       public String getContentType() {
         return req.getContentType();
       }
-      @Override
       public String getName() {
         return null; // Is there any meaningful name?
       }
-      @Override
       public String getSourceInfo() {
         return null; // Is there any meaningful source?
       }
-      @Override
       public Long getSize() { 
         String v = req.getHeader( "Content-Length" );
         if( v != null ) {
@@ -254,6 +252,12 @@
       }
       public InputStream getStream() throws IOException {
         return req.getInputStream();
+      }
+      public Reader getReader() throws IOException {
+        String charset = ContentStreamBase.getCharsetFromContentType( req.getContentType() );
+        return charset == null 
+          ? new InputStreamReader( getStream() )
+          : new InputStreamReader( getStream(), charset );
       }
     });
     return SolrRequestParsers.parseQueryString( req.getQueryString() );

Re: svn commit: r536048 - /lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java

Posted by Ryan McKinley <ry...@gmail.com>.

Chris Hostetter wrote:
> A few questions about this...
> 
> : SOLR-214 -- override getReader() explicitly.  subclass called instance
> : variables, not getContentType()
> 
> 1) wouldn't it make sense to make ContentStreamBase use getContentType()
> instead of contentType so that it can be subclassed in this case, instead
> of duplicating the getReader in every subclass?
> 

agree.
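
To make point 1 concrete, here is a rough sketch of why reading the field and calling the accessor behave differently once a subclass overrides getContentType(). StreamBase, parseCharset, forRequest, charsetViaAccessor and charsetViaField are all hypothetical names made up for this illustration; this is not the actual ContentStreamBase code, only a simplified stand-in for it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Locale;

public class AccessorSketch {

  // Hypothetical, simplified stand-in for ContentStreamBase.
  public abstract static class StreamBase {
    protected String contentType; // stays null unless a subclass sets it

    public String getContentType() {
      return contentType;
    }

    public abstract InputStream getStream() throws IOException;

    // Goes through the accessor, so an overridden getContentType() is honored.
    public String charsetViaAccessor() {
      return parseCharset(getContentType());
    }

    // Reads the field directly, so an overridden getContentType() is ignored.
    public String charsetViaField() {
      return parseCharset(contentType);
    }

    // With the accessor in place, subclasses no longer need their own getReader().
    public Reader getReader() throws IOException {
      String charset = charsetViaAccessor();
      return charset == null
          ? new InputStreamReader(getStream())
          : new InputStreamReader(getStream(), charset);
    }
  }

  // Hypothetical helper: the text after "charset=" up to the next ';', or null.
  static String parseCharset(String contentType) {
    if (contentType == null) return null;
    int idx = contentType.toLowerCase(Locale.ROOT).indexOf("charset=");
    if (idx < 0) return null;
    String rest = contentType.substring(idx + "charset=".length());
    int end = rest.indexOf(';');
    String value = (end < 0 ? rest : rest.substring(0, end)).trim();
    return value.isEmpty() ? null : value;
  }

  // Mirrors the anonymous class in the commit: the content type comes from
  // the request rather than from a field on the base class.
  public static StreamBase forRequest(final String requestContentType, final byte[] body) {
    return new StreamBase() {
      @Override public String getContentType() { return requestContentType; }
      @Override public InputStream getStream() { return new ByteArrayInputStream(body); }
    };
  }

  public static void main(String[] args) throws IOException {
    StreamBase s = forRequest("text/xml; charset=ISO-8859-1", new byte[] {(byte) 0xE9});
    System.out.println(s.charsetViaAccessor());
    System.out.println(s.charsetViaField());
  }
}
```

In this sketch the field-based lookup never sees the charset, because the anonymous subclass supplies the content type only through the overridden method.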


> 2) ...
> 
> : +      }
> : +      public Reader getReader() throws IOException {
> : +        String charset = ContentStreamBase.getCharsetFromContentType( req.getContentType() );
> : +        return charset == null
> : +          ? new InputStreamReader( getStream() )
> : +          : new InputStreamReader( getStream(), charset );
> 
> ...do we really want to use the single arg constructor if the request has
> no charset, or should we assume UTF-8 ? 

I like the suggestion to default to 'utf-8' -- this way the behavior is 
well defined.  Otherwise it depends on the platform configuration and 
how the container may (or may not) deal with charset encodings.
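
Roughly, the UTF-8 default could look like the sketch below. It is standalone so it can be run outside a servlet container: charsetFromContentType is a hypothetical stand-in for ContentStreamBase.getCharsetFromContentType() (the real parsing may differ), and openReader and readAll are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetDefaultSketch {

  // Hypothetical stand-in for ContentStreamBase.getCharsetFromContentType().
  public static String charsetFromContentType(String contentType) {
    if (contentType == null) return null;
    for (String part : contentType.split(";")) {
      String p = part.trim();
      if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
        String value = p.substring("charset=".length()).trim();
        return value.isEmpty() ? null : value;
      }
    }
    return null;
  }

  // Always passes an explicit charset, falling back to UTF-8, so the result
  // does not depend on the platform default encoding.
  public static Reader openReader(InputStream in, String contentType) throws IOException {
    String charset = charsetFromContentType(contentType);
    return new InputStreamReader(in, charset == null ? "UTF-8" : charset);
  }

  // Small helper for the demo: drain a Reader into a String.
  public static String readAll(Reader r) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = r.read()) != -1) {
      sb.append((char) c);
    }
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    // No charset in the Content-Type: the body is decoded as UTF-8.
    byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
    System.out.println(readAll(openReader(new ByteArrayInputStream(utf8), "text/xml")));

    // Explicit charset in the Content-Type: that charset wins.
    byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
    System.out.println(readAll(openReader(
        new ByteArrayInputStream(latin1), "text/xml; charset=ISO-8859-1")));
  }
}
```

An unknown charset name still surfaces as an UnsupportedEncodingException from the two-arg constructor, same as in the committed code.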


ryan

Re: svn commit: r536048 - /lucene/solr/trunk/src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java

Posted by Chris Hostetter <ho...@fucit.org>.

A few questions about this...

: SOLR-214 -- override getReader() explicitly.  subclass called instance
: variables, not getContentType()

1) wouldn't it make sense to make ContentStreamBase use getContentType()
instead of contentType so that it can be subclassed in this case, instead
of duplicating the getReader in every subclass?

2) ...

: +      }
: +      public Reader getReader() throws IOException {
: +        String charset = ContentStreamBase.getCharsetFromContentType( req.getContentType() );
: +        return charset == null
: +          ? new InputStreamReader( getStream() )
: +          : new InputStreamReader( getStream(), charset );

...do we really want to use the single arg constructor if the request has
no charset, or should we assume UTF-8 ?  ... i seem to recall we talked
about this once before and i argued in favor of using the default and letting
the servlet container "do the right thing", but i thought the consensus was
to assume UTF-8 if the request itself didn't contain any explicit charset?

these are the two (semi-jumbled) threads i'm thinking of...

http://www.nabble.com/resin-and-UTF-8-in-URLs-tf3152910.html
http://www.nabble.com/charset-in-POST-from-browser-tf3153057.html



-Hoss