Posted to user@commons.apache.org by Thufir <ha...@gmail.com> on 2012/03/24 11:44:55 UTC

NNTPClient.retrieveArticleBody returns MalformedServerReplyException

What's the correct way to get an article body?

I'm using java.util.logging.Logger to write each
org.apache.commons.net.MalformedServerReplyException to a log file:

<record>
  <date>2012-03-24T03:09:35</date>
  <millis>1332583775299</millis>
  <sequence>1</sequence>
  <logger>gwene.LogUtils</logger>
  <level>INFO</level>
  <class>gwene.LogUtils</class>
  <method>logArticles</method>
  <thread>1</thread>
  <message>Could not parse response code.
Server Reply: &lt;p&gt;Alex &amp;#8220;Hurricane&amp;#8221; 
Higgins, transformer of snooker, died on July 24th, aged
...text snipped...
mercilessly, one by one.  ...&lt;/p&gt;&lt;div 
class="feedflare"&gt;</message>
</record>


The server reply is *exactly* what I'm missing: the content of the 
article.  Code and full output:

https://gist.github.com/2180843

I'm guessing that the HTML is throwing things off?  What does 
NNTPClient.retrieveArticleBody expect?  After all, anything can be in an 
NNTP post.

Now, what I'm really after, I suppose, is the server reply because that 
has the body of the NNTP article.  However, surely, that's not the way 
to use org.apache.commons.net.nntp.NNTPClient, only I can't find the 
correct way.  Hence this kludge to grab the MalformedServerReply instead 
of parsing it.

I suppose it's possible to log everything, and then parse the log file, 
but that seems like a very complex way of doing a simple thing.

The API documentation for NNTPClient assumes a knowledge of NNTP which, 
unfortunately, I don't have.  I've looked through the example code and 
don't see any samples where article bodies are parsed.  The closest I 
see is NNTPClient.retrieveArticleBody:

https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/NNTPClient.html#retrieveArticleBody%28java.lang.String%29

However, that just reports malformed content.  Since Pan can connect 
with gmane fine, the server itself is presumably not the problem.  Also, 
comparing against the Pan newsreader, the server reply that 
NNTPClient.retrieveArticleBody reports matches what I'm after -- namely, 
the body of the article.

What is the correct way to grab the article body?  I've looked through 
the API quite thoroughly.

Surely there must be an example for parsing the article body, not just 
the header.  Or, at least, using BufferedReader to get the article body 
and assign it to a String.  If so, I don't see a better method available 
through the API.
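
For illustration only, here is a minimal sketch of the "read a reader into a String" step on its own. It uses only the JDK; the class and method names (BodyToString, readAll) are made up for this example and are not part of commons-net. In real use the Reader would presumably come from a call such as NNTPClient.retrieveArticleBody, which the API notes say may return null when the article does not exist; a StringReader stands in for it here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of commons-net.
final class BodyToString {

    /**
     * Reads every remaining line from the reader and joins the lines with '\n'.
     * Returns null if the reader itself is null (for example, no such article).
     * The reader is closed once it has been fully read.
     */
    static String readAll(Reader source) {
        if (source == null) {
            return null;
        }
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            // Wrapped so callers do not have to declare IOException.
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the Reader an NNTP client would hand back.
        Reader standIn = new StringReader("first line\r\nsecond line\r\n");
        System.out.print(readAll(standIn));
    }
}
```

Reading to the end before doing anything else with the connection matters, because the API notes say the returned reader reads directly from the NNTP connection.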



thanks,

Thufir

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
On Mon, Mar 26, 2012 at 11:14 AM, sebb <se...@gmail.com> wrote:
[...]
> As I already wrote, use NNTPClient.retrieveArticle(long articleNumber).
>
> I tried using it with news.gmane.org and it worked fine.
>
> See the sample app I created recently:
>
> https://svn.apache.org/repos/asf/commons/proper/net/trunk/src/main/java/examples/nntp/ArticleReader.java
>
> This will be added to the site when it is next updated.
[...]

Wow, thank you.  I'll exercise it a bit more, but quite interesting.
Again, thank you.


-Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by sebb <se...@gmail.com>.
On 26 March 2012 15:23, Thufir <ha...@gmail.com> wrote:
> On 03/26/2012 06:06 AM, sebb wrote:
>>
>> Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
>> pointer)  and NNTPClient.retrieveArticle(long articleNumber)
>
>
> Both of those return with MalformedServerReplyException for me, as before.
>
> I notice that Article API says, about Article, that:
>
> This is a class that contains the basic state needed for message retrieval
> and threading. With thanks to Jamie Zawinski
> https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/Article.html
>
> So, the message body is not in the Article, I don't think.

I did not write that it was.

> Maybe I'm
> misunderstanding the docs.  My reading is that some variant of
> NNTPClient.retrieveBody will return the body.

Yes.

> However, there's a note that:
>
> "A DotTerminatedMessageReader is returned from which the article can be
> read. If the article does not exist, null is returned.
>
> You must not issue any commands to the NNTP server (i.e., call any other
> methods) until you finish reading the message from the returned
> BufferedReader instance. The NNTP protocol uses the same stream for issuing
> commands as it does for returning results. Therefore the returned
> BufferedReader actually reads directly from the NNTP connection. After the
> end of message has been reached, new commands can be executed and their
> replies read. If you do not follow these requirements, your program will not
> work properly. "
>
> throughout the NNTPClient documentation, for many methods.
>
> That being said, there are zero examples of retrieving the message body.
>  Maybe it's a completely different approach than retrieving Articles?

As I already wrote, use NNTPClient.retrieveArticle(long articleNumber).

I tried using it with news.gmane.org and it worked fine.

See the sample app I created recently:

https://svn.apache.org/repos/asf/commons/proper/net/trunk/src/main/java/examples/nntp/ArticleReader.java

This will be added to the site when it is next updated.


>
> -Thufir
>
>



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
Looking more carefully at the telnet response:

https://gist.github.com/2205577

It looks like leafnode is, in fact, correctly returning status codes 
222 and 223 for the BODY and NEXT commands.

If Leafnode is returning correct responses, then why are 
MalformedServerReplyExceptions being thrown?  Is it possible to look at 
the response somehow, even if it is malformed?

Based on the telnet output, the responses seem to be correct, at least 
from leafnode.  What's triggering the MalformedServerReplyException?


thanks,

Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
On 03/26/2012 06:06 AM, sebb wrote:
> Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
> pointer)  and NNTPClient.retrieveArticle(long articleNumber)

Both of those return with MalformedServerReplyException for me, as before.

I notice that Article API says, about Article, that:

This is a class that contains the basic state needed for message 
retrieval and threading. With thanks to Jamie Zawinski
https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/Article.html

So, the message body is not in the Article, I don't think.  Maybe I'm 
misunderstanding the docs.  My reading is that some variant of 
NNTPClient.retrieveBody will return the body.

However, there's a note that:

"A DotTerminatedMessageReader is returned from which the article can be 
read. If the article does not exist, null is returned.

You must not issue any commands to the NNTP server (i.e., call any other 
methods) until you finish reading the message from the returned 
BufferedReader instance. The NNTP protocol uses the same stream for 
issuing commands as it does for returning results. Therefore the 
returned BufferedReader actually reads directly from the NNTP 
connection. After the end of message has been reached, new commands can 
be executed and their replies read. If you do not follow these 
requirements, your program will not work properly. "

throughout the NNTPClient documentation, for many methods.
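
To make the quoted note concrete, here is a simplified model of reading a dot-terminated message from a shared stream. It is a sketch using only the JDK, not the actual DotTerminatedMessageReader code; the class name, method name, and sample reply lines are invented for illustration. It shows why the message has to be read to its end first: the next reply sits in the same stream, directly behind the terminating "." line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical model of dot-terminated reading, not the commons-net class.
final class DotTerminated {

    /**
     * Reads lines up to, but not including, the line consisting of a single ".".
     * A leading "." on any other line is dropped (dot-stuffing).
     * Everything after the terminator stays unread in the reader.
     */
    static String readMessage(BufferedReader connection) {
        StringBuilder body = new StringBuilder();
        try {
            String line;
            while ((line = connection.readLine()) != null) {
                if (line.equals(".")) {
                    break; // end of this message
                }
                if (line.startsWith(".")) {
                    line = line.substring(1); // undo dot-stuffing
                }
                body.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // One stream carries the status line, the body, and the next reply.
        BufferedReader connection = new BufferedReader(new StringReader(
                "222 0 <id@example.invalid> body follows\r\n"
                + "Hello\r\n"
                + "..a line that starts with a dot\r\n"
                + ".\r\n"
                + "223 1 <next@example.invalid> article retrieved\r\n"));

        String status = connection.readLine();    // reply to the BODY command
        String body = readMessage(connection);    // must be drained completely
        String nextReply = connection.readLine(); // only now is the next reply readable

        System.out.println(status);
        System.out.print(body);
        System.out.println(nextReply);
    }
}
```

If a new command were sent before the body had been drained, the client would read leftover body text where it expects a status line.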

That being said, there are zero examples of retrieving the message body. 
  Maybe it's a completely different approach than retrieving Articles?


-Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by sebb <se...@gmail.com>.
On 26 March 2012 12:06, Thufir <ha...@gmail.com> wrote:
> On 03/24/2012 04:22 AM, sebb wrote:
> [...]
>
>>> Surely there must be an example for parsing the article body, not just
>>> the
>>> header.  Or, at least, using BufferedReader to get the article body and
>>> assign it to a String.  If so, I don't see a better method available
>>> through
>>> the API.
>>
>>
>> Have a look at the examples in:
>>
>> http://commons.apache.org/net/examples/nntp/
>
>
> I have looked at the examples there, and read the protocol (as best I could)
> as well as experimented with telnet.  I even installed leafnode to
> troubleshoot this.
>
> Pardon, which example shows how to get the body?  The subject, yes, but the
> body is not in the Article class.  The NNTPCommand class looks quite
> promising, but I'm not quite sure how to use it.  The body is, so far as I
> can tell, best available through NNTPClient.retrieveArticleBody(), but this
> doesn't work with leafnode any better than against the gmane server.
>
> Here is the code and output I have:
>
> https://gist.github.com/2170467
>
> It's quite straightforward to get message bodies through telnet, and,
> surely, if it works through telnet it should work through NNTPClient.
> Unfortunately, I don't know which methods correspond to the telnet commands
> in the gist link above.

Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
pointer)  and NNTPClient.retrieveArticle(long articleNumber)

>
> thank you,
>
>
> Thufir
>



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]
>> Surely there must be an example for parsing the article body, not just the
>> header.  Or, at least, using BufferedReader to get the article body and
>> assign it to a String.  If so, I don't see a better method available through
>> the API.
>
> Have a look at the examples in:
>
> http://commons.apache.org/net/examples/nntp/

I have looked at the examples there, and read the protocol (as best I 
could) as well as experimented with telnet.  I even installed leafnode 
to troubleshoot this.

Pardon, which example shows how to get the body?  The subject, yes, but 
the body is not in the Article class.  The NNTPCommand class looks quite 
promising, but I'm not quite sure how to use it.  The body is, so far as 
I can tell, best available through NNTPClient.retrieveArticleBody(), but 
this doesn't work with leafnode any better than against the gmane server.

Here is the code and output I have:

https://gist.github.com/2170467

It's quite straightforward to get message bodies through telnet, and, 
surely, if it works through telnet it should work through NNTPClient. 
Unfortunately, I don't know which methods correspond to the telnet 
commands in the gist link above.


thank you,

Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]

Thank you very much for your quick response.   I went ahead and 
installed leafnode to pull in a few servers -- they all seem to have 
poorly formed content, though.

> NNTP was defined in http://tools.ietf.org/html/rfc977
>
> See section 3.1.3 which shows that the body content must be preceded
> by a status reply.
>
> That appears to be missing in the response from the server.

Yeah, I see more what you mean.  There's supposed to be an int preceding 
any response, the status code.

[...]
> Have a look at the examples in:
>
> http://commons.apache.org/net/examples/nntp/

Certainly, and that's where the code comes from:

https://gist.github.com/2180843

All this leads me to infer that it might just be the state of things 
that there's malformed content.

Now, the data does basically comply with the structure.  So, generally, 
how would I get malformed (and well-formed) data together into a 
Collection of Strings?

Am I using the API incorrectly? I think I'm pretty close, so, how to 
proceed?

Any general suggestions?  Updated code at:
https://gist.github.com/2180843





thanks,

Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]
> NNTP was defined in http://tools.ietf.org/html/rfc977
>
> See section 3.1.3 which shows that the body content must be preceded
> by a status reply.
>
> That appears to be missing in the response from the server.
[...]

Since Leafnode also appears not to precede its responses with a status 
code, I asked how to configure that:

http://unix.stackexchange.com/questions/35045/news-server-leafnode-reply-not-proceeded-with-status-reply-code

If the question is poorly posed, would you let 
me know?  My understanding here is that the problem lies with the 
server, which, in this case, was gmane and is now leafnode.


thanks,

Thufir



Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException

Posted by sebb <se...@gmail.com>.
On 24 March 2012 10:44, Thufir <ha...@gmail.com> wrote:
> What's the correct way to get an article body?
>
> I'm using java.util.logging.Logger to catch
> org.apache.commons.net.MalformedServerReplyException to a log file:
>
> <record>
>   <date>2012-03-24T03:09:35</date>
>   <millis>1332583775299</millis>
>   <sequence>1</sequence>
>   <logger>gwene.LogUtils</logger>
>   <level>INFO</level>
>   <class>gwene.LogUtils</class>
>   <method>logArticles</method>
>   <thread>1</thread>
>   <message>Could not parse response code.
> Server Reply: &lt;p&gt;Alex &amp;#8220;Hurricane&amp;#8221; Higgins,
> transformer of snooker, died on July 24th, aged
> ...text snipped...
> mercilessly, one by one.  ...&lt;/p&gt;&lt;div
> class="feedflare"&gt;</message>
> </record>
>
>
> The server reply is *exactly* what I'm missing, the content of the article.
>  code and full output:
>
> https://gist.github.com/2180843
>
> I'm guessing that the HTML is throwing things off?  What does
> NNTPClient.retrieveArticleBody expect?  After all, anything can be in an
> NNTP post.

NNTP was defined in http://tools.ietf.org/html/rfc977

See section 3.1.3 which shows that the body content must be preceded
by a status reply.

That appears to be missing in the response from the server.
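
As a rough sketch of what that means for a client (this is not the commons-net implementation, and the names StatusLine and parseCode are invented for the example): the first line of a reply is expected to begin with a three-digit status code. If that line is article text instead, there is no code to extract, which would be consistent with the "Could not parse response code" message in the log. The rule that the code is followed by a space or the end of the line is an assumption of this sketch.

```java
// Hypothetical helper, not part of commons-net.
final class StatusLine {

    /**
     * Returns the three-digit status code at the start of a reply line,
     * or -1 if the line does not begin with three digits followed by
     * a space or the end of the line (an assumption of this sketch).
     */
    static int parseCode(String line) {
        if (line == null || line.length() < 3) {
            return -1;
        }
        for (int i = 0; i < 3; i++) {
            char c = line.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
        }
        if (line.length() > 3 && line.charAt(3) != ' ') {
            return -1;
        }
        return Integer.parseInt(line.substring(0, 3));
    }

    public static void main(String[] args) {
        // A reply whose first line is a status line; the body would follow.
        System.out.println(parseCode("222 0 <id@example.invalid> body follows"));
        // Article text where the status line should be.
        System.out.println(parseCode("<p>Alex Higgins, transformer of snooker"));
    }
}
```

The first call yields 222 and the second yields -1, which is the situation in which a client has to give up on the reply.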

> Now, what I'm really after, I suppose, is the server reply because that has
> the body of the NNTP article.  However, surely, that's not the way to use
> org.apache.commons.net.nntp.NNTPClient, only I can't find the correct way.
>  Hence this kludge to grab the MalformedServerReply instead of parsing it.
>
> I suppose it's possible to log everything, and then parse the log file, but
> that seems like a very complex way of doing a simple thing.
>
> The API documentation for NNTPClient assumes a knowledge of NNTP which,
> unfortunately, I don't have.  I've looked through the example code and don't
> see any samples where article bodies are parsed.  The closest I see is
> NNTPClient.retrieveArticleBody:
>
> https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/NNTPClient.html#retrieveArticleBody%28java.lang.String%29
>
> however, that's just malformed content.  Presumably, since Pan can connect
> with gmane fine, that's not the problem.  Also, by looking in the Pan
> newsreader, NNTPClient.retrieveArticleBody results match with what I'm after
> -- namely, the body of the article.
>
> What is the correct way to grab the article body?  I've looked through the
> API quite thoroughly.
>
> Surely there must be an example for parsing the article body, not just the
> header.  Or, at least, using BufferedReader to get the article body and
> assign it to a String.  If so, I don't see a better method available through
> the API.

Have a look at the examples in:

http://commons.apache.org/net/examples/nntp/


>
>
> thanks,
>
> Thufir
>
