You are viewing a plain text version of this content. The canonical link for it is here.
Posted to server-dev@james.apache.org by "Noel J. Bergman" <no...@devtech.com> on 2003/03/17 01:35:21 UTC

TODO: NNTP Changes

We don't want to use a Reader when dealing with message bodies. Readers know
about character encodings, and we don't want to do that for the message
body.

We could take the code that we have for SMTP.  Starting at
SMTPHandler.doDATA(), and going down into the process routines, stripping
out the stuff that is SMTP specific.  We can hide the change within the
NNTPReader class, but we need to change that class' constructor to accept an
InputStream instead of a Reader.

There is a similar problem with writeArticle ... it also uses a Reader.
Again, this is not difficult.  We have the code in POP3Handler.doRETR to
C&P.

I figure that it is a couple of hours worth of work.  Anyone have some time
to work on these changes?

	--- Noel


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


Re: NNTP Changes

Posted by Harmeet Bedi <ha...@kodemuse.com>.
Something wrong in my msg.
Noel corrected me.  From Noel - "conversions except for UTF-8 are 7 bit, and
are irreversible."

Thought I'd send this out in case others note it too.

Harmeet
----- Original Message -----
From: "Harmeet Bedi" <ha...@kodemuse.com>
To: "Stanislav Mironov" <be...@ksu.ru>; "James Developers List"
<ja...@jakarta.apache.org>
Sent: Monday, March 17, 2003 10:10 PM
Subject: Re: NNTP Changes


> ----- Original Message -----
> From: "Noel J. Bergman" <no...@devtech.com>
> Subject: TODO: NNTP Changes
>
>
> > We don't want to use a Reader when dealing with message bodies. Readers
> know
> > about character encodings, and we don't want to do that for the message
> > body.
>
> James seems to be dealing a lot with readers and writers. For example an
> igrep on InternetPrintWriter shows this
>
> ./core/MimeMessageWrapper.java:98:import
> org.apache.james.util.InternetPrintWriter;
> ./core/MimeMessageWrapper.java:288:            PrintWriter pos = new
> InternetPrintWriter(new BufferedWriter(new OutputStreamWriter(headerOs),
> 512), true);
> ./core/MimeMessageWrapper.java:328:            PrintWriter pos = new
> InternetPrintWriter(new BufferedWriter(new OutputStreamWriter(headerOs),
> 512), true);
> ./nntpserver/NNTPHandler.java:84:import
> org.apache.james.util.InternetPrintWriter;
> ./nntpserver/NNTPHandler.java:376:            writer = new
> InternetPrintWriter(new BufferedWriter(new
> OutputStreamWriter(socket.getOutputStream()), 1024), true);
> ./pop3server/POP3Handler.java:85:import
> org.apache.james.util.InternetPrintWriter;
> ./pop3server/POP3Handler.java:298:            out = new
> InternetPrintWriter(outs, true);
> ./smtpserver/SMTPHandler.java:96:import
> org.apache.james.util.InternetPrintWriter;
> ./smtpserver/SMTPHandler.java:389:            out = new
> InternetPrintWriter(new BufferedWriter(new
> OutputStreamWriter(socket.getOutputStream()), 1024), false);
>
> I think it is fine to use Readers and Writers as long as they are used
> consistently. An alternative to removing Reader/Writer usage and only
using
> streams would be to use either Reader/Writers without enforcing a
> ByteToCharacter conversion or to explicitly specify the ByteToCharacter
> conversion everywhere.
>
> We don't seem to be specifying the conversion consistently. For instance
> here is the grep on Reader.
>
> ./core/MailImpl.java:475:            br = new BufferedReader(new
> InputStreamReader(message.getInputStream()));
> ./core/MimeMessageWrapper.java:500:            LineNumberReader counter =
> new LineNumberReader(new InputStreamReader(in, getEncoding()));
> ./nntpserver/NNTPHandler.java:375:            reader = new
> BufferedReader(new InputStreamReader(socket.getInputStream(), "ASCII"),
> 1024);
> ./nntpserver/NNTPHandler.java:515:            doMODEREADER(argument);
> ./nntpserver/NNTPHandler.java:1269:    private void doMODEREADER(String
> argument) {
> ./nntpserver/repository/NNTPArticleImpl.java:138:            reader = new
> BufferedReader(new FileReader(articleFile));
> ./nntpserver/repository/NNTPArticleImpl.java:165:
BufferedReader
> reader = new BufferedReader(new FileReader(articleFile));
> ./nntpserver/repository/NNTPArticleImpl.java:183:
BufferedReader
> reader = new BufferedReader(new FileReader(articleFile));
> ./pop3server/POP3Handler.java:268:            in = new BufferedReader(new
> InputStreamReader(socket.getInputStream(), "ASCII"), 512);
> ./remotemanager/RemoteManagerHandler.java:290:            in = new
> BufferedReader(new InputStreamReader(socket.getInputStream(), "ASCII"),
> 512);
> ./smtpserver/SMTPHandler.java:357:            inReader = new
> BufferedReader(new InputStreamReader(in, "ASCII"), 512);
>
> If replacing the Readers/Writers with Streams is right, I think we should
at
> least replace InternetPrintWriter with InternetPrintStream. Code
conversion
> to completely get rid of Readers and Writers may be significant and there
> may not be a gain. There may be performance gain in replacing
Reader/Writers
> with streams but code does need to deal with lines and strings.
>
> I favor leaving readers and writers as they are but removing the "ASCII"
> encoding rule for conversion in James code or having a consitent character
> conversion mechanism everywhere over removing Readers and Writers.
>
> The way I see it bytes are input and output from James, to either network
or
> disk. Whatever byte to char conversion james uses is transient and is ok
as
> long as it is consistent.
>
> If this is the right approach at least to fix in NNTP would be as
suggested
> be Stanislav Mironov in
>
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=james-user@jakarta.apach
> e.org&msgNo=5445
>
> Message has the patch flipped. Here is the patch from head
> Index: NNTPHandler.java
> ===================================================================
> RCS file:
>
/home/cvs/jakarta-james/src/java/org/apache/james/nntpserver/NNTPHandler.jav
> a,v
> retrieving revision 1.37
> diff -u -r1.37 NNTPHandler.java
> --- NNTPHandler.java 8 Mar 2003 21:14:04 -0000 1.37
> +++ NNTPHandler.java 18 Mar 2003 03:01:44 -0000
> @@ -372,7 +372,7 @@
>              synchronized (this) {
>                  handlerThread = Thread.currentThread();
>              }
> -            reader = new BufferedReader(new
> InputStreamReader(socket.getInputStream(), "ASCII"), 1024);
> +            reader = new BufferedReader(new
> InputStreamReader(socket.getInputStream()), 1024);
>              writer = new InternetPrintWriter(new BufferedWriter(new
> OutputStreamWriter(socket.getOutputStream()), 1024), true);
>          } catch (Exception e) {
>              getLogger().error( "Cannot open connection from: " +
> e.getMessage(), e );
>
> There should be similar fixes in SMTPHandler and POP3Handler. Would be
good
> to remove "ASCII" in reading esp. as we don't specify that conversion
> mechanism in Writing.
>
> thoughts ???
>
> Harmeet
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: james-dev-help@jakarta.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


RE: NNTP Changes

Posted by "Noel J. Bergman" <no...@devtech.com>.
> Would it make sense to explicitly specify a lossless conversion mechanism
> for bytes->chars->bytes everywhere James uses Readers/Writers?

There are good reasons why JavaMail only uses Reader objects for
configuration info, or when it knows the specific Content-Type charset.

	--- Noel


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


Re: NNTP Changes

Posted by Harmeet Bedi <ha...@kodemuse.com>.
----- Original Message -----
From: "Noel J. Bergman" <no...@devtech.com>
> > I favor leaving readers and writers as they are but removing the "ASCII"
> > encoding rule for conversion in James code or having a consitent
character
> > conversion mechanism everywhere over removing Readers and Writers.
>
> Have you looked at the code to see what happens if you don't specify the
> encoding?  It doesn't do what you apparently think it does.

Would it make sense to explicitly specify a lossless conversion mechanism
for
bytes->chars->bytes everywhere James uses Readers/Writers ?
Or set such a lossless convertor as the default convertor.
bytes->chars->bytes should be consistent across James.

Harmeet


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


RE: NNTP Changes

Posted by "Noel J. Bergman" <no...@devtech.com>.
> James seems to be dealing a lot with readers and writers. For example an
> igrep on InternetPrintWriter shows this

POP3Handler and SMTPHandler use InternetPrintWriter to write response codes
to the client.  MimeMessageWrapper uses it to emit Headers.  There may be an
issue there.  Defer for the moment.

> I think it is fine to use Readers and Writers as long as they are used
> consistently.

With one exception, character encodings are not assured to be reversible.

> I favor leaving readers and writers as they are but removing the "ASCII"
> encoding rule for conversion in James code or having a consitent character
> conversion mechanism everywhere over removing Readers and Writers.

Have you looked at the code to see what happens if you don't specify the
encoding?  It doesn't do what you apparently think it does.

> If this is the right approach at least to fix in NNTP
> would be as suggested be Stanislav Mironov

His patch only works because his system uses cyrillic.  Peter is almost
finished making the changes, and will send them to me for integration.

	--- Noel


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


Re: NNTP Changes

Posted by Harmeet Bedi <ha...@kodemuse.com>.
----- Original Message -----
From: "Noel J. Bergman" <no...@devtech.com>
Subject: TODO: NNTP Changes


> We don't want to use a Reader when dealing with message bodies. Readers
know
> about character encodings, and we don't want to do that for the message
> body.

James seems to be dealing a lot with readers and writers. For example an
igrep on InternetPrintWriter shows this

./core/MimeMessageWrapper.java:98:import
org.apache.james.util.InternetPrintWriter;
./core/MimeMessageWrapper.java:288:            PrintWriter pos = new
InternetPrintWriter(new BufferedWriter(new OutputStreamWriter(headerOs),
512), true);
./core/MimeMessageWrapper.java:328:            PrintWriter pos = new
InternetPrintWriter(new BufferedWriter(new OutputStreamWriter(headerOs),
512), true);
./nntpserver/NNTPHandler.java:84:import
org.apache.james.util.InternetPrintWriter;
./nntpserver/NNTPHandler.java:376:            writer = new
InternetPrintWriter(new BufferedWriter(new
OutputStreamWriter(socket.getOutputStream()), 1024), true);
./pop3server/POP3Handler.java:85:import
org.apache.james.util.InternetPrintWriter;
./pop3server/POP3Handler.java:298:            out = new
InternetPrintWriter(outs, true);
./smtpserver/SMTPHandler.java:96:import
org.apache.james.util.InternetPrintWriter;
./smtpserver/SMTPHandler.java:389:            out = new
InternetPrintWriter(new BufferedWriter(new
OutputStreamWriter(socket.getOutputStream()), 1024), false);

I think it is fine to use Readers and Writers as long as they are used
consistently. An alternative to removing Reader/Writer usage and only using
streams would be to use either Reader/Writers without enforcing a
ByteToCharacter conversion or to explicitly specify the ByteToCharacter
conversion everywhere.

We don't seem to be specifying the conversion consistently. For instance
here is the grep on Reader.

./core/MailImpl.java:475:            br = new BufferedReader(new
InputStreamReader(message.getInputStream()));
./core/MimeMessageWrapper.java:500:            LineNumberReader counter =
new LineNumberReader(new InputStreamReader(in, getEncoding()));
./nntpserver/NNTPHandler.java:375:            reader = new
BufferedReader(new InputStreamReader(socket.getInputStream(), "ASCII"),
1024);
./nntpserver/NNTPHandler.java:515:            doMODEREADER(argument);
./nntpserver/NNTPHandler.java:1269:    private void doMODEREADER(String
argument) {
./nntpserver/repository/NNTPArticleImpl.java:138:            reader = new
BufferedReader(new FileReader(articleFile));
./nntpserver/repository/NNTPArticleImpl.java:165:            BufferedReader
reader = new BufferedReader(new FileReader(articleFile));
./nntpserver/repository/NNTPArticleImpl.java:183:            BufferedReader
reader = new BufferedReader(new FileReader(articleFile));
./pop3server/POP3Handler.java:268:            in = new BufferedReader(new
InputStreamReader(socket.getInputStream(), "ASCII"), 512);
./remotemanager/RemoteManagerHandler.java:290:            in = new
BufferedReader(new InputStreamReader(socket.getInputStream(), "ASCII"),
512);
./smtpserver/SMTPHandler.java:357:            inReader = new
BufferedReader(new InputStreamReader(in, "ASCII"), 512);

If replacing the Readers/Writers with Streams is right, I think we should at
least replace InternetPrintWriter with InternetPrintStream. Code conversion
to completely get rid of Readers and Writers may be significant and there
may not be a gain. There may be performance gain in replacing Reader/Writers
with streams but code does need to deal with lines and strings.

I favor leaving readers and writers as they are but removing the "ASCII"
encoding rule for conversion in James code or having a consitent character
conversion mechanism everywhere over removing Readers and Writers.

The way I see it bytes are input and output from James, to either network or
disk. Whatever byte to char conversion james uses is transient and is ok as
long as it is consistent.

If this is the right approach at least to fix in NNTP would be as suggested
be Stanislav Mironov in
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=james-user@jakarta.apach
e.org&msgNo=5445

Message has the patch flipped. Here is the patch from head
Index: NNTPHandler.java
===================================================================
RCS file:
/home/cvs/jakarta-james/src/java/org/apache/james/nntpserver/NNTPHandler.jav
a,v
retrieving revision 1.37
diff -u -r1.37 NNTPHandler.java
--- NNTPHandler.java 8 Mar 2003 21:14:04 -0000 1.37
+++ NNTPHandler.java 18 Mar 2003 03:01:44 -0000
@@ -372,7 +372,7 @@
             synchronized (this) {
                 handlerThread = Thread.currentThread();
             }
-            reader = new BufferedReader(new
InputStreamReader(socket.getInputStream(), "ASCII"), 1024);
+            reader = new BufferedReader(new
InputStreamReader(socket.getInputStream()), 1024);
             writer = new InternetPrintWriter(new BufferedWriter(new
OutputStreamWriter(socket.getOutputStream()), 1024), true);
         } catch (Exception e) {
             getLogger().error( "Cannot open connection from: " +
e.getMessage(), e );

There should be similar fixes in SMTPHandler and POP3Handler. Would be good
to remove "ASCII" in reading esp. as we don't specify that conversion
mechanism in Writing.

thoughts ???

Harmeet


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


Re: NNTP Changes

Posted by Harmeet Bedi <ha...@kodemuse.com>.
----- Original Message -----
From: "Noel J. Bergman" <no...@devtech.com>
> out the stuff that is SMTP specific.  We can hide the change within the
> NNTPReader class, but we need to change that class' constructor to accept
an
> InputStream instead of a Reader.
>
> There is a similar problem with writeArticle ... it also uses a Reader.

This seems simple to do. Streams seem nicer to use in these case. I had
orig. picked up readers to avoid Java deprecation warnings.

> I figure that it is a couple of hours worth of work.  Anyone have some
time
> to work on these changes?

Hopefully later tonight i'll be able to post patch for review.

Harmeet


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org