Posted to dev@commons.apache.org by Brad Neuberg <bk...@columbia.edu> on 2004/09/23 02:34:56 UTC
[feedparser] Patch for Text America support, AOL Journal Atom
support, and a refactored TestProbeLocator
Apologies if you get this twice; it was rejected the first time because I
attached the patch to this email and it was too large.
This email describes several enhancements to the Jakarta Feed Parser. The
patch file in unified diff format is located at
http://codinginparadise.org/feedparser/textamerica.patch. Here are the
enhancements:
1) TestProbeLocator.java - I've completely refactored this test class to
make it much more maintainable. It used to be huge, since it included a
large number of tests. I've refactored all of the testing code into one
generic method, testSite, which is called by the test method for each blog
service that we cover, such as testXanga(), testLiveJournal(), etc. The
code is much more readable and understandable now. I've also added tests
for AOL Journal's new Atom support (see point 2 below) and for our new Text
America support (see point 3 below).
2) AOL Journal - AOL Journal recently turned on Atom support (in addition
to their existing RSS support), and it is exposed through autodiscovery.
I've updated the probe locator system to find the Atom feed aggressively if
autodiscovery fails (but only if aggressive discovery is turned on; it is
off by default). It turns out that AOL's support for Atom is still a bit
sketchy: an HTTP GET on the Atom file works, but an HTTP HEAD request
fails with an HTTP 500 internal server error. HEAD requests on the RSS
file do not have this problem. This means that aggressive discovery will
not work for Atom yet, because it issues a HEAD request instead of a GET
for performance reasons. I've updated the code to do the probing anyway,
but the Atom probe will fail for now. This doesn't break anything, since
the locator then tries the RSS file and succeeds. If AOL fixes this bug,
the code will automatically find and prefer the Atom feed, which is good.
3) Text America Support - Text America is a mobile blogging service
(http://www.textamerica.com). They have an evil bug, though: their pages
advertise feeds through autodiscovery, but the link points to the wrong
location! To handle this, I added BlogService.TEXTAMERICA, changed
BlogService to have a property named hasValidAutodiscovery which indicates
whether we can trust the values given by autodiscovery for a particular
blog service, and added a hasValidAutodiscovery() method to grab this
value. I then modified FeedLocator to always call ProbeLocator, and
changed ProbeLocator to "fail fast" when we already have some results
(found through autodiscovery, for example) AND
blogService.hasValidAutodiscovery() returns true. If it returns false, we
clear out the list of existing, false autodiscovery links and go ahead
with aggressive link probing.
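For anyone who wants the gist of points 2 and 3 without reading the diff,
here is a rough sketch of the decision logic. It is not the code from the
patch: SketchBlogService, ProbeLocatorSketch, the placeholder feed paths,
and the injected "exists" check (standing in for the HTTP HEAD probe) are
all made up for illustration. It also assumes that a service with
untrusted autodiscovery is probed even when the aggressive flag is off,
which is only one reading of the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for BlogService; the feed paths are placeholders. */
enum SketchBlogService {
    AOL_JOURNAL(true, "/placeholder-atom.xml", "/placeholder-rss.xml"),
    TEXTAMERICA(false, "/placeholder-rss.xml");

    private final boolean validAutodiscovery;
    private final String[] probePaths; // preference order: Atom before RSS

    SketchBlogService(boolean validAutodiscovery, String... probePaths) {
        this.validAutodiscovery = validAutodiscovery;
        this.probePaths = probePaths;
    }

    boolean hasValidAutodiscovery() { return validAutodiscovery; }

    String[] probePaths() { return probePaths.clone(); }
}

/** Illustrative only; the real ProbeLocator in the patch may differ. */
final class ProbeLocatorSketch {
    private ProbeLocatorSketch() {}

    static List<String> locate(String baseUrl, SketchBlogService service,
                               List<String> autodiscovered, boolean aggressive,
                               Predicate<String> exists) {
        boolean trusted = service.hasValidAutodiscovery();
        // Fail fast: trusted autodiscovery results are returned as they are.
        if (trusted && !autodiscovered.isEmpty()) {
            return new ArrayList<>(autodiscovered);
        }
        // Untrusted autodiscovery links are dropped by starting from an empty list.
        List<String> found = new ArrayList<>();
        // Assumption: untrusted services are probed even if aggressive is off.
        if (!trusted || aggressive) {
            for (String path : service.probePaths()) {
                String candidate = baseUrl + path;
                // 'exists' stands in for the HEAD request; a failing Atom
                // probe simply falls through to the RSS candidate.
                if (exists.test(candidate)) {
                    found.add(candidate);
                    break;
                }
            }
        }
        return found;
    }
}
```

With a check that rejects the Atom candidate (as AOL's HEAD handling
currently does), the loop falls through to the RSS path; once the Atom
probe succeeds, the same loop would return the Atom feed first.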
I've run TestProbeLocator and everything passes.
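To give a feel for the refactoring in point 1, the shape of the test class
is roughly as follows. This is a sketch rather than the file in the patch:
the class name, the injected locator function, and the example.invalid URLs
are hypothetical, and plain AssertionError is used here so the snippet
stands alone without a test framework.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Sketch only; the real TestProbeLocator in the patch may differ. */
final class ProbeLocatorTestSketch {
    // Hypothetical hook: maps a blog URL to the feed URLs located for it.
    private final Function<String, List<String>> locator;

    ProbeLocatorTestSketch(Function<String, List<String>> locator) {
        this.locator = locator;
    }

    /** Generic check shared by every per-service test method. */
    void testSite(String resource, String... expectedFeeds) {
        List<String> expected = Arrays.asList(expectedFeeds);
        List<String> actual = locator.apply(resource);
        if (!expected.equals(actual)) {
            throw new AssertionError(
                resource + ": expected " + expected + " but found " + actual);
        }
    }

    // Each service test shrinks to a single call with its own URLs.
    void testXanga() {
        testSite("http://example.invalid/xanga",
                 "http://example.invalid/xanga/rss");
    }

    void testLiveJournal() {
        testSite("http://example.invalid/livejournal",
                 "http://example.invalid/livejournal/rss");
    }
}
```

Adding coverage for another service then means adding one short method
that passes that service's blog URL and expected feed URLs to testSite.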
Brad Neuberg, bkn3@columbia.edu
Senior Software Engineer, Rojo Networks