Posted to dev@commons.apache.org by Brad Neuberg <bk...@columbia.edu> on 2004/09/23 02:34:56 UTC

[feedparser] Patch for Text America support, AOL Journal Atom support, and a refactored TestProbeLocator

Apologies if you get this twice; it was rejected the first time because I 
attached the patch to this email and it was too large.

This email describes several enhancements to the Jakarta Feed Parser.  The 
patch file in unified diff format is located at 
http://codinginparadise.org/feedparser/textamerica.patch.  Here are the 
enhancements:

1) TestProbeLocator.java - I've completely refactored this testing class to 
make it much more maintainable.  It used to be huge, since it included a 
large number of tests.  I've moved all of the shared testing code into one 
generic method, testSite, which is called by the test method for each blog 
service, such as testXanga(), testLiveJournal(), etc.  The code is much 
more readable and understandable now.  I've also added testing for AOL 
Journal's new Atom support (see point 2 below) and for our new Text America 
support (see point 3 below).
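
To illustrate the shape of that refactoring, here is a minimal sketch, not 
the code in the patch.  Only the names testSite, testXanga and 
testLiveJournal come from the description above; the parameters, the 
Function-based locator stand-in and the example.invalid URLs are 
assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ProbeLocatorTestSketch {

    // Generic check shared by every per-service test. The locator argument is a
    // hypothetical stand-in for the real feed location call.
    static void testSite(String serviceName, String blogUrl, List<String> expectedFeeds,
                         Function<String, List<String>> locator) {
        List<String> actual = locator.apply(blogUrl);
        if (!expectedFeeds.equals(actual)) {
            throw new AssertionError(
                serviceName + ": expected " + expectedFeeds + " but found " + actual);
        }
    }

    // Each service-specific test only supplies its own data (hypothetical URLs).
    static void testXanga(Function<String, List<String>> locator) {
        String blog = "http://xanga.example.invalid/someuser";
        testSite("Xanga", blog, Arrays.asList(blog + "/rss"), locator);
    }

    static void testLiveJournal(Function<String, List<String>> locator) {
        String blog = "http://livejournal.example.invalid/someuser";
        testSite("LiveJournal", blog, Arrays.asList(blog + "/rss"), locator);
    }
}
```

Adding a new service then means adding one short method with its URLs, 
rather than another copy of the checking logic.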

2) AOL Journal - AOL Journal recently turned on Atom support (in addition 
to their existing RSS support).  This is exposed in autodiscovery.  I've 
updated the probe locator system to be able to aggressively find the Atom 
feed if autodiscovery fails (but only if aggressive discovery is turned on, 
and it is off by default).  It turns out that AOL's support for Atom is 
still a bit sketchy: an HTTP GET on the Atom file works, but an HTTP HEAD 
request fails with an HTTP 500 Internal Server Error.  This does not happen 
for HEAD requests on the RSS file.  As a result, aggressive discovery will 
not work for Atom, because for performance reasons it issues a HEAD request 
instead of a GET.  I've updated the code to do the probing, but the Atom 
probe will fail for now.  This doesn't break anything, since the code then 
tries to retrieve the RSS file and succeeds.  If AOL fixes this bug, the 
code will automatically find and prefer the Atom feed, which is good.
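
The probing order described in point 2 can be sketched roughly as follows.  
This is an illustration only, not the patch's code: the class and method 
names are hypothetical, the candidate URLs are placeholders, and treating 
only a 200 status as success is an assumption.  The HEAD check is passed in 
as a function so the selection logic can be exercised without touching the 
network.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Optional;
import java.util.function.ToIntFunction;

class HeadProbeSketch {

    // Issues an HTTP HEAD request and returns the status code, or -1 if the
    // request could not be made at all.
    static int headStatus(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("HEAD");
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return -1;
        }
    }

    // Returns the first candidate whose HEAD probe answers 200. Candidates are
    // listed in order of preference, so putting Atom before RSS means Atom wins
    // as soon as its probe starts succeeding.
    static Optional<String> firstReachable(List<String> candidates,
                                           ToIntFunction<String> head) {
        for (String candidate : candidates) {
            if (head.applyAsInt(candidate) == 200) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

In real use the second argument would be HeadProbeSketch::headStatus.  With 
the Atom URL listed first, a 500 on its HEAD probe simply falls through to 
the RSS URL, which matches the behavior described above.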

3)  Text America Support - Text America is a mobile blogging service 
(http://www.textamerica.com).  They have an evil bug, though.  They have 
autodiscovery on their feeds, but it points to the wrong location!  To 
handle this, I added BlogService.TEXTAMERICA, changed BlogService to have a 
property named hasValidAutodiscovery which indicates whether we can trust 
the values given by autodiscovery for a particular blog service, and added 
a hasValidAutodiscovery() method to grab this value.  I then modified 
FeedLocator to always call ProbeLocator, and changed ProbeLocator to "fail 
fast" only if we already have some results (found through autodiscovery, 
for example) AND blogService.hasValidAutodiscovery() returns true.  If it 
returns false, then we clear out the list of existing, false autodiscovery 
links and go ahead and do aggressive link probing.
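
Here is a rough sketch of that control flow, under stated assumptions.  
BlogService is modeled as an enum purely for brevity, OTHER is a 
hypothetical placeholder for any service whose autodiscovery can be 
trusted, and the locate method with its Supplier-based probe is invented 
for illustration.  The sketch also ignores the switch that turns 
aggressive discovery on or off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class AutodiscoveryTrustSketch {

    // Simplified stand-in for BlogService; only the new flag is modeled.
    enum BlogService {
        TEXTAMERICA(false), // autodiscovery points to the wrong location
        OTHER(true);        // hypothetical: a service with trustworthy autodiscovery

        private final boolean hasValidAutodiscovery;

        BlogService(boolean hasValidAutodiscovery) {
            this.hasValidAutodiscovery = hasValidAutodiscovery;
        }

        boolean hasValidAutodiscovery() {
            return hasValidAutodiscovery;
        }
    }

    // Combines autodiscovery results with link probing.
    static List<String> locate(BlogService service, List<String> autodiscovered,
                               Supplier<List<String>> probe) {
        List<String> results = new ArrayList<>(autodiscovered);
        // Fail fast: trusted autodiscovery results mean no probing is needed.
        if (!results.isEmpty() && service.hasValidAutodiscovery()) {
            return results;
        }
        // Untrusted autodiscovery: discard the false links before probing.
        if (!service.hasValidAutodiscovery()) {
            results.clear();
        }
        results.addAll(probe.get());
        return results;
    }
}
```

Keeping the trust decision on BlogService means a service with broken 
autodiscovery is handled by flipping one flag rather than by special-casing 
it inside the locator.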

I've run TestProbeLocator and everything passes.


Brad Neuberg, bkn3@columbia.edu
Senior Software Engineer, Rojo Networks
