Posted to user@commons.apache.org by Thufir <ha...@gmail.com> on 2012/03/24 11:44:55 UTC
NNTPClient.retrieveArticleBody returns MalformedServerReplyException
What's the correct way to get an article body?
I'm using java.util.logging.Logger to record the
org.apache.commons.net.MalformedServerReplyException in a log file:
<record>
<date>2012-03-24T03:09:35</date>
<millis>1332583775299</millis>
<sequence>1</sequence>
<logger>gwene.LogUtils</logger>
<level>INFO</level>
<class>gwene.LogUtils</class>
<method>logArticles</method>
<thread>1</thread>
<message>Could not parse response code.
Server Reply: <p>Alex &#8220;Hurricane&#8221;
Higgins, transformer of snooker, died on July 24th, aged
...text snipped...
mercilessly, one by one. ...</p><div
class="feedflare"></message>
</record>
The server reply is *exactly* what I'm missing: the content of the
article. Code and full output:
https://gist.github.com/2180843
I'm guessing that the HTML is throwing things off? What does
NNTPClient.retrieveArticleBody expect? After all, anything can be in an
NNTP post.
Now, what I'm really after, I suppose, is the server reply because that
has the body of the NNTP article. However, surely, that's not the way
to use org.apache.commons.net.nntp.NNTPClient, only I can't find the
correct way. Hence this kludge to grab the MalformedServerReply instead
of parsing it.
I suppose it's possible to log everything, and then parse the log file,
but that seems like a very complex way of doing a simple thing.
The API documentation for NNTPClient assumes a knowledge of NNTP which,
unfortunately, I don't have. I've looked through the example code and
don't see any samples where article bodies are parsed. The closest I
see is NNTPClient.retrieveArticleBody:
https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/NNTPClient.html#retrieveArticleBody%28java.lang.String%29
however, that's just malformed content. Presumably, since Pan can
connect with gmane fine, that's not the problem. Also, by looking in
the Pan newsreader, NNTPClient.retrieveArticleBody results match with
what I'm after -- namely, the body of the article.
What is the correct way to grab the article body? I've looked through
the API quite thoroughly.
Surely there must be an example for parsing the article body, not just
the header. Or, at least, using BufferedReader to get the article body
and assign it to a String. If so, I don't see a better method available
through the API.
thanks,
Thufir
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
On Mon, Mar 26, 2012 at 11:14 AM, sebb <se...@gmail.com> wrote:
[...]
> As I already wrote, use NNTPClient.retrieveArticle(long articleNumber).
>
> I tried using it with news.gmane.org and it worked fine.
>
> See the sample app I created recently:
>
> https://svn.apache.org/repos/asf/commons/proper/net/trunk/src/main/java/examples/nntp/ArticleReader.java
>
> This will be added to the site when it is next updated.
[...]
Wow, thank you. I'll exercise it a bit more, but quite interesting.
Again, thank you.
-Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by sebb <se...@gmail.com>.
On 26 March 2012 15:23, Thufir <ha...@gmail.com> wrote:
> On 03/26/2012 06:06 AM, sebb wrote:
>>
>> Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
>> pointer) and NNTPClient.retrieveArticle(long articleNumber)
>
>
> Both of those return with MalformedServerReplyException for me, as before.
>
> I notice that Article API says, about Article, that:
>
> This is a class that contains the basic state needed for message retrieval
> and threading. With thanks to Jamie Zawinski
> https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/Article.html
>
> So, the message body is not in the Article, I don't think.
I did not write that it was.
> Maybe I'm
> misunderstanding the docs. My reading is that some variant of
> NNTPClient.retrieveBody will return the body.
Yes.
> However, there's a note that:
>
> "A DotTerminatedMessageReader is returned from which the article can be
> read. If the article does not exist, null is returned.
>
> You must not issue any commands to the NNTP server (i.e., call any other
> methods) until you finish reading the message from the returned
> BufferedReader instance. The NNTP protocol uses the same stream for issuing
> commands as it does for returning results. Therefore the returned
> BufferedReader actually reads directly from the NNTP connection. After the
> end of message has been reached, new commands can be executed and their
> replies read. If you do not follow these requirements, your program will not
> work properly. "
>
> throughout the NNTPClient documentation, for many methods.
>
> That being said, there are zero examples of retrieving the message body.
> Maybe it's a completely different approach than retrieving Articles?
As I already wrote, use NNTPClient.retrieveArticle(long articleNumber).
I tried using it with news.gmane.org and it worked fine.
See the sample app I created recently:
https://svn.apache.org/repos/asf/commons/proper/net/trunk/src/main/java/examples/nntp/ArticleReader.java
This will be added to the site when it is next updated.
>
> -Thufir
>
>
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
Looking more carefully at the telnet response here:
https://gist.github.com/2205577
It looks like leafnode is, in fact, correctly sending status codes
222 and 223 in response to the BODY and NEXT commands.
If Leafnode is returning correct responses, then why are
MalformedServerReplyExceptions being thrown? Is it possible to look
at the response somehow, even if it is malformed?
Based on the telnet output, the responses seem to be correct, at least
from leafnode. What's triggering the MalformedServerReplyException?
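For reference, the shape of a BODY exchange as described in RFC 977 (a 222 status line, the text lines, then a line holding a single ".") can be sketched in plain Java. This is an illustration under stated assumptions, not Commons Net code: the class name DotTerminatedBody, the method readBody and the sample reply are hypothetical, and the reply is fed from a StringReader rather than a live server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DotTerminatedBody {
    /**
     * Reads one status line, then text lines up to a lone ".".
     * Returns the text lines with dot-stuffing undone.
     */
    public static List<String> readBody(BufferedReader in) {
        List<String> lines = new ArrayList<>();
        try {
            String status = in.readLine();
            // 222 is the RFC 977 code for "article retrieved, body follows".
            if (status == null || !status.startsWith("222")) {
                throw new IllegalStateException("Unexpected status line: " + status);
            }
            String line;
            while ((line = in.readLine()) != null) {
                if (line.equals(".")) {
                    return lines; // terminator reached
                }
                // Any other line starting with "." had an extra "." added by the sender.
                lines.add(line.startsWith(".") ? line.substring(1) : line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        throw new IllegalStateException("Stream ended before the terminating '.'");
    }

    public static void main(String[] args) {
        // Hypothetical reply, standing in for what a server would send.
        String reply = "222 3 <id@example.invalid> body follows\r\n"
                + "<p>First line</p>\r\n"
                + "..a line that began with a dot\r\n"
                + ".\r\n";
        System.out.println(readBody(new BufferedReader(new StringReader(reply))));
    }
}
```

If the first line read is article text rather than a status line, this sketch fails at the first check, which is the same kind of symptom as a "Could not parse response code" error.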
thanks,
Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
On 03/26/2012 06:06 AM, sebb wrote:
> Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
> pointer) and NNTPClient.retrieveArticle(long articleNumber)
Both of those return with MalformedServerReplyException for me, as before.
I notice that Article API says, about Article, that:
This is a class that contains the basic state needed for message
retrieval and threading. With thanks to Jamie Zawinski
https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/Article.html
So, the message body is not in the Article, I don't think. Maybe I'm
misunderstanding the docs. My reading is that some variant of
NNTPClient.retrieveBody will return the body.
However, there's a note that:
"A DotTerminatedMessageReader is returned from which the article can be
read. If the article does not exist, null is returned.
You must not issue any commands to the NNTP server (i.e., call any other
methods) until you finish reading the message from the returned
BufferedReader instance. The NNTP protocol uses the same stream for
issuing commands as it does for returning results. Therefore the
returned BufferedReader actually reads directly from the NNTP
connection. After the end of message has been reached, new commands can
be executed and their replies read. If you do not follow these
requirements, your program will not work properly. "
throughout the NNTPClient documentation, for many methods.
That being said, there are zero examples of retrieving the message body.
Maybe it's a completely different approach than retrieving Articles?
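The requirement in the note quoted above, that the returned reader be read to the end before any other command is issued, can be sketched as follows. The sketch assumes nothing about Commons Net beyond the returned object being a java.io.Reader; ReaderDrain and drain are hypothetical names, and a StringReader stands in for the reader a retrieve call would hand back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderDrain {
    /** Reads every remaining line from the reader into one String. */
    public static String drain(Reader source) {
        StringBuilder body = new StringBuilder();
        BufferedReader in = new BufferedReader(source);
        try {
            String line;
            // Read until end of stream so nothing is left unread on the connection.
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the reader returned by a retrieve call.
        Reader fake = new StringReader("line one\r\nline two\r\n");
        System.out.print(drain(fake));
    }
}
```

The sketch deliberately does not close the reader; whether and when to close it is left to the caller.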
-Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by sebb <se...@gmail.com>.
On 26 March 2012 12:06, Thufir <ha...@gmail.com> wrote:
> On 03/24/2012 04:22 AM, sebb wrote:
> [...]
>
>>> Surely there must be an example for parsing the article body, not just
>>> the
>>> header. Or, at least, using BufferedReader to get the article body and
>>> assign it to a String. If so, I don't see a better method available
>>> through
>>> the API.
>>
>>
>> Have a look at the examples in:
>>
>> http://commons.apache.org/net/examples/nntp/
>
>
> I have looked at the examples there, and read the protocol (as best I could)
> as well as experimented with telnet. I even installed leafnode to
> troubleshoot this.
>
> Pardon, which example shows how to get the body? The subject, yes, but the
> body is not in the Article class. The NNTPCommand class looks quite
> promising, but I'm not quite sure how to use it. The body is, so far as I
> can tell, best available through NNTPClient.retrieveArticleBody(), but this
> doesn't work with leafnode any better than against the gmane server.
>
> Here is the code and output I have:
>
> https://gist.github.com/2170467
>
> It's quite straightforward to get message bodies through telnet, and,
> surely, if it works through telnet it should work through NNTPClient.
> Unfortunately, I don't know which methods correspond to the telnet commands
> in the gist link above.
Try NNTPClient.retrieveArticle(long articleNumber, ArticleInfo
pointer) and NNTPClient.retrieveArticle(long articleNumber)
>
> thank you,
>
>
> Thufir
>
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]
>> Surely there must be an example for parsing the article body, not just the
>> header. Or, at least, using BufferedReader to get the article body and
>> assign it to a String. If so, I don't see a better method available through
>> the API.
>
> Have a look at the examples in:
>
> http://commons.apache.org/net/examples/nntp/
I have looked at the examples there, and read the protocol (as best I
could) as well as experimented with telnet. I even installed leafnode
to troubleshoot this.
Pardon, which example shows how to get the body? The subject, yes, but
the body is not in the Article class. The NNTPCommand class looks quite
promising, but I'm not quite sure how to use it. The body is, so far as
I can tell, best available through NNTPClient.retrieveArticleBody(), but
this doesn't work with leafnode any better than against the gmane server.
Here is the code and output I have:
https://gist.github.com/2170467
It's quite straightforward to get message bodies through telnet, and,
surely, if it works through telnet it should work through NNTPClient.
Unfortunately, I don't know which methods correspond to the telnet
commands in the gist link above.
thank you,
Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]
Thank you very much for your quick response. I went ahead and
installed leafnode to pull in a few servers -- they all seem to have
poorly formed content, though.
> NNTP was defined in http://tools.ietf.org/html/rfc977
>
> See section 3.1.3 which shows that the body content must be preceded
> by a status reply.
>
> That appears to be missing in the response from the server.
Yeah, I see more what you mean. There's supposed to be an int preceding
any response, the status code.
[...]
> Have a look at the examples in:
>
> http://commons.apache.org/net/examples/nntp/
Certainly, and that's where the code came from:
https://gist.github.com/2180843
All this leads me to infer that it might just be the state of things
that there's malformed content.
Now, it does, basically, comply with the structure. So, generally, how
would I get malformed (and well formed) data together into a Collection
of Strings?
Am I using the API incorrectly? I think I'm pretty close, so, how to
proceed?
Any general suggestions? Updated code at:
https://gist.github.com/2180843
thanks,
Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by Thufir <ha...@gmail.com>.
On 03/24/2012 04:22 AM, sebb wrote:
[...]
> NNTP was defined in http://tools.ietf.org/html/rfc977
>
> See section 3.1.3 which shows that the body content must be preceded
> by a status reply.
>
> That appears to be missing in the response from the server.
[...]
Since Leafnode also appears not to precede its responses with a status
code, I asked how to configure that:
http://unix.stackexchange.com/questions/35045/news-server-leafnode-reply-not-proceeded-with-status-reply-code
If the question is poorly posed, would you let
me know? My understanding here is that the problem lies with the
server, which, in this case, was gmane and is now leafnode.
thanks,
Thufir
Re: NNTPClient.retrieveArticleBody returns MalformedServerReplyException
Posted by sebb <se...@gmail.com>.
On 24 March 2012 10:44, Thufir <ha...@gmail.com> wrote:
> What's the correct way to get an article body?
>
> I'm using java.util.logging.Logger to catch
> org.apache.commons.net.MalformedServerReplyException to a log file:
>
> <record>
> <date>2012-03-24T03:09:35</date>
> <millis>1332583775299</millis>
> <sequence>1</sequence>
> <logger>gwene.LogUtils</logger>
> <level>INFO</level>
> <class>gwene.LogUtils</class>
> <method>logArticles</method>
> <thread>1</thread>
> <message>Could not parse response code.
> Server Reply: <p>Alex &#8220;Hurricane&#8221; Higgins,
> transformer of snooker, died on July 24th, aged
> ...text snipped...
> mercilessly, one by one. ...</p><div
> class="feedflare"></message>
> </record>
>
>
> The server reply is *exactly* what I'm missing, the content of the article.
> code and full output:
>
> https://gist.github.com/2180843
>
> I'm guessing that the HTML is throwing things off? What does
> NNTPClient.retrieveArticleBody expect? After all, anything can be in an
> NNTP post.
NNTP was defined in http://tools.ietf.org/html/rfc977
See section 3.1.3 which shows that the body content must be preceded
by a status reply.
That appears to be missing in the response from the server.
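To make the point about the missing status reply concrete, here is a minimal sketch in plain Java of checking whether a line begins with a three-digit NNTP reply code. The class and method names (NntpStatusLine, parseCode) and the sample status line are made up for illustration; this is not how Commons Net itself parses replies, only a sketch of the rule in RFC 977.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NntpStatusLine {
    // A reply line starts with exactly three digits, optionally followed by a space and text.
    private static final Pattern STATUS = Pattern.compile("^(\\d{3})(?: (.*))?$");

    /** Returns the reply code, or -1 if the line does not start with one. */
    public static int parseCode(String line) {
        Matcher m = STATUS.matcher(line);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        // Hypothetical status line of the kind that precedes a body.
        System.out.println(parseCode("222 10110 <id@example.invalid> body follows"));
        // A line of article text, like the one reported in the log.
        System.out.println(parseCode("<p>Alex &#8220;Hurricane&#8221; Higgins, transformer of snooker"));
    }
}
```

A line of article HTML carries no leading code, which is consistent with the "Could not parse response code" message in the log.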
> Now, what I'm really after, I suppose, is the server reply because that has
> the body of the NNTP article. However, surely, that's not the way to use
> org.apache.commons.net.nntp.NNTPClient, only I can't find the correct way.
> Hence this kludge to grab the MalformedServerReply instead of parsing it.
>
> I suppose it's possible to log everything, and then parse the log file, but
> that seems like a very complex way of doing a simple thing.
>
> The API documentation for NNTPClient assumes a knowledge of NNTP which,
> unfortunately, I don't have. I've looked through the example code and don't
> see any samples where article bodies are parsed. The closest I see is
> NNTPClient.retrieveArticleBody:
>
> https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/NNTPClient.html#retrieveArticleBody%28java.lang.String%29
>
> however, that's just malformed content. Presumably, since Pan can connect
> with gmane fine, that's not the problem. Also, by looking in the Pan
> newsreader, NNTPClient.retrieveArticleBody results match with what I'm after
> -- namely, the body of the article.
>
> What is the correct way to grab the article body? I've looked through the
> API quite thoroughly.
>
> Surely there must be an example for parsing the article body, not just the
> header. Or, at least, using BufferedReader to get the article body and
> assign it to a String. If so, I don't see a better method available through
> the API.
Have a look at the examples in:
http://commons.apache.org/net/examples/nntp/
>
>
> thanks,
>
> Thufir
>