Posted to oro-user@jakarta.apache.org by Harald Kuhn <ha...@ontopia.net> on 2002/06/24 15:17:54 UTC

MatchActionProcessor and character encodings

Hi 

I am new to this list and am currently parsing CSV (comma-separated)
files using the MatchActionProcessor.

The files are in ISO-8859-1 character encoding, and I am working on Debian
Linux with ASCII as the default encoding.
I used the processMatches(InputStream input, OutputStream output) method, but
it corrupted the non-ASCII characters.
Looking into the source code (from CVS), I saw that you are using a
LineNumberReader and an InputStreamReader to wrap the InputStream, with the
InputStreamReader(InputStream in) constructor, which uses the system default
encoding.
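
To illustrate the effect, here is a minimal sketch that uses only JDK classes
and no ORO code. The class name DecodeSketch, the helper firstLine and the
sample bytes are made up for this example. It decodes the same ISO-8859-1
bytes once with the correct charset and once as US-ASCII, which is roughly
what happens when the platform default is ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;

// Illustration only: hypothetical class, not part of ORO.
class DecodeSketch {

    // Reads the first line of the stream, decoding the bytes with the
    // named charset instead of the platform default.
    static String firstLine(InputStream in, String encoding) {
        try {
            LineNumberReader lines =
                new LineNumberReader(new InputStreamReader(in, encoding));
            return lines.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One CSV line in ISO-8859-1; 0xFC is u-umlaut, 0xF6 is o-umlaut.
        byte[] latin1 = {'M', (byte) 0xFC, 'l', 'l', 'e', 'r', ',',
                         'K', (byte) 0xF6, 'l', 'n', '\n'};

        // Decoded with the charset the file was written in.
        System.out.println(firstLine(new ByteArrayInputStream(latin1), "ISO-8859-1"));

        // Decoded as ASCII: the two high bytes are not valid ASCII,
        // so they cannot come out as the original characters.
        System.out.println(firstLine(new ByteArrayInputStream(latin1), "US-ASCII"));
    }
}
```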

I changed the code to 

	processMatches(Reader input, OutputStream output) 

and used an InputStreamReader for which I specified the ISO-8859-1 encoding in
the constructor, which works perfectly fine.


The question is whether it would be possible to overload
processMatches(...), adding a version with either a third parameter (String) to
specify the encoding of the InputStreamReader, like

	processMatches(InputStream input, OutputStream output, String encoding)

or one that takes a Reader instead of an InputStream, as I did?
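
A rough sketch of how such overloads could delegate to each other is below.
This is not the ORO source: the class OverloadSketch, the method name process
and the helper roundTrip are hypothetical, and the line-copying loop only
stands in for the real pattern matching. It also assumes a Writer on the
output side and the same encoding for input and output, which may not be what
you would want in MatchActionProcessor.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for illustration, NOT MatchActionProcessor itself.
class OverloadSketch {

    // Existing style: bytes are decoded with the platform default charset.
    static void process(InputStream in, OutputStream out) throws IOException {
        process(new InputStreamReader(in), new OutputStreamWriter(out));
    }

    // Proposed variant 1: the caller names the encoding.
    // Assumption: the same encoding is used for reading and writing.
    static void process(InputStream in, OutputStream out, String encoding)
            throws IOException {
        process(new InputStreamReader(in, encoding),
                new OutputStreamWriter(out, encoding));
    }

    // Proposed variant 2: the caller supplies character streams directly,
    // so the decoding decision is made entirely outside this class.
    static void process(Reader in, Writer out) throws IOException {
        BufferedReader lines = new BufferedReader(in);
        String line;
        while ((line = lines.readLine()) != null) {
            // Placeholder for the real work: each line is copied unchanged.
            out.write(line);
            out.write('\n');
        }
        out.flush();
    }

    // Small helper for trying it out: runs variant 1 over a byte array.
    static byte[] roundTrip(byte[] data, String encoding) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            process(new ByteArrayInputStream(data), out, encoding);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With either variant the two-argument method can stay as it is, so existing
callers would not be affected.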

Most of the time the existing method works fine, but since I frequently saw
questions concerning Unicode etc. in the archive, this could be interesting
for other users as well.

(I did not want to post the complete source code on this list, to keep the
mail as short as possible, but would do so if you are interested.)

Harald






