Posted to solr-user@lucene.apache.org by Shalin Shekhar Mangar <sh...@gmail.com> on 2009/11/18 11:45:55 UTC

Re: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer

On Wed, Nov 18, 2009 at 6:56 AM, Joe Kessel <is...@hotmail.com> wrote:

>
> While trying to make use of the StreamingUpdateSolrServer for updates with
> the release code for Solr 1.4 I noticed some characters such as é did not
> show up in the index correctly.  The code should set the CharsetName via the
> constructor of the OutputStreamWriter.  I noticed that the
> CommonsHttpSolrServer seems to set the charset to UTF-8.  As a workaround I
> am able to use the CommonsHttpSolrServer.  Being new to Solr, not sure what
> the bug protocol is, assuming this is a bug.
>
>
I wrote a simple test case and I'm able to index and query 'é' and other
characters using StreamingUpdateSolrServer. Can you use -Dfile.encoding=UTF8
as a JVM parameter and see if that fixes your case? If it does, then it may
be a Solr bug.

-- 
Regards,
Shalin Shekhar Mangar.
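
To make the suspected cause concrete: an OutputStreamWriter created without
a charset encodes with the JVM's default charset, which is what
-Dfile.encoding influences, while one created with an explicit charset does
not depend on that setting. The following is only a minimal sketch of that
behaviour, not the StreamingUpdateSolrServer code; the class name CharsetDemo
and its helper methods are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Solr or SolrJ.
public class CharsetDemo {

    // Writes text through an OutputStreamWriter that uses the given charset
    // and returns the bytes that would be sent.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, charset)) {
            w.write(text);
        }
        return out.toByteArray();
    }

    // Same as encode(), but uses the one-argument constructor, which falls
    // back to the JVM's default charset.
    static byte[] encodeWithPlatformDefault(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out)) {
            w.write(text);
        }
        return out.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex, e.g. "C3 A9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u00e9"; // é
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default bytes:   " + hex(encodeWithPlatformDefault(text)));
        System.out.println("UTF-8 bytes:     " + hex(encode(text, StandardCharsets.UTF_8)));
        System.out.println("Latin-1 bytes:   " + hex(encode(text, StandardCharsets.ISO_8859_1)));
    }
}
```

If the "default bytes" line differs from the "UTF-8 bytes" line on a given
machine, a receiver that expects UTF-8 will misread 'é', which would be
consistent with the symptom reported above. Running the same program with
-Dfile.encoding=UTF8 is one way to see whether the default changes.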

RE: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer

Posted by Joe Kessel <is...@hotmail.com>.
I finally got around to testing the patch and it works well. 

 

Thanks,

Joe
 
> Date: Mon, 23 Nov 2009 12:32:46 -0800
> From: hossman_lucene@fucit.org
> To: solr-user@lucene.apache.org
> Subject: RE: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer
> 
> 
> : Specifying the file.encoding did work, although I don't think it is a 
> : suitable workaround for my use case. Any idea what my next step is to 
> : have a bug opened?
> 
> no, you shouldn't *have* to specify -Dfile.encoding=UTF8, Shalin was 
> just asking you to try that to verify that really was the extent of the 
> problem
> 
> I created a bug to track this...
> https://issues.apache.org/jira/browse/SOLR-1595
> 
> 
> -Hoss
> 
 		 	   		  

RE: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer

Posted by Chris Hostetter <ho...@fucit.org>.
: Specifying the file.encoding did work, although I don't think it is a 
: suitable workaround for my use case.  Any idea what my next step is to 
: have a bug opened?

no, you shouldn't *have* to specify -Dfile.encoding=UTF8, Shalin was 
just asking you to try that to verify that really was the extent of the 
problem

I created a bug to track this...
https://issues.apache.org/jira/browse/SOLR-1595


-Hoss


RE: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer

Posted by Joe Kessel <is...@hotmail.com>.
Specifying the file.encoding did work, although I don't think it is a suitable workaround for my use case.  Any idea what my next step is to have a bug opened?

 

Thanks,

Joe
 
 		 	   		  