Posted to user@abdera.apache.org by Kiran Subbaraman <ki...@yahoo.com> on 2007/10/23 08:41:29 UTC

Error parsing certain feeds - troubleshooting steps?

I noticed that Abdera fails to parse some feeds, and I also saw some
discussion on the list about feed validation. Based on that, these are the
steps I follow to determine whether a parsing problem lies in Abdera or in a
malformed feed.

These are the feeds that I tried:
- http://www.feedsfarm.com/frontpage/health/atom - Does not parse
- http://www.feedsfarm.com/frontpage/health/rss - Parses
- http://feeds.feedburner.com/techtarget/tsscom/home - Does not parse
- http://www.oreillynet.com/pub/feed/1?format=rss1 - Parses
and a few more.

Are these steps sufficient, or are there other checks I should perform?
Thanks,
Kiran


Step 1
-------
Use FeedValidator to check the validity of the feed. FeedValidator is
available here: http://feedvalidator.org
* Valid Atom feeds are processed by Abdera.
* RSS processing is still being introduced into Abdera, so some valid
RSS feeds may not be processed yet.
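
As a quick local pre-check before (not instead of) FeedValidator, something
like the sketch below may help. It uses only the JDK's XML parser to confirm
that the document is well-formed XML and to report what kind of root element
it has. The class and method names (FeedShapeCheck, rootKind) are made up for
this example and are not part of Abdera, and well-formedness is a much weaker
test than feed validity.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;

// Hypothetical helper for illustration; not part of Abdera or FeedValidator.
final class FeedShapeCheck {

	static final String ATOM_NS = "http://www.w3.org/2005/Atom";

	// Parses the stream as XML and classifies the root element.
	// Throws org.xml.sax.SAXException if the document is not well-formed.
	static String rootKind(InputStream in) throws Exception {
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
		factory.setNamespaceAware(true);
		DocumentBuilder builder = factory.newDocumentBuilder();
		Element root = builder.parse(in).getDocumentElement();

		String name = root.getLocalName();
		String ns = root.getNamespaceURI();
		if ("feed".equals(name) && ATOM_NS.equals(ns)) {
			return "atom";
		}
		if ("rss".equals(name)) {
			return "rss";
		}
		if ("RDF".equals(name)) {
			return "rdf";
		}
		return "unknown: {" + ns + "}" + name;
	}

	public static void main(String[] args) throws Exception {
		// Inline sample so the sketch runs without network access.
		String sample = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>t</title></feed>";
		InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
		System.out.println(rootKind(in));
	}
}
```

A result other than "atom" for a feed that is advertised as Atom would be a
hint to look at the root element and its namespace before suspecting Abdera.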

Step 2
-------
Also use curl to check whether the feed is being served correctly.
curl can be obtained from: http://curl.haxx.se/download.html
Try this: curl -i "http://news.google.com/?output=atom" (the -i option
prints the response status line and headers along with the body). In this
particular example, Google returns a 403, so a program must first establish
a connection that Google accepts before it can retrieve the feed content.
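
The same check can be made from Java before handing the stream to a parser.
The following is only a sketch using the JDK's HttpURLConnection; the class
and method names (FeedHttpCheck, fetchStatus, describe) are made up for this
example, and the wording of the status descriptions is my own.

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper for illustration; not part of Abdera.
final class FeedHttpCheck {

	// Turns an HTTP status code into a short troubleshooting hint.
	static String describe(int status) {
		if (status >= 200 && status < 300) {
			return "OK: the server returned a response body";
		}
		if (status >= 300 && status < 400) {
			return "Redirect: check the Location header";
		}
		if (status == 403) {
			return "Forbidden: the server refused this request";
		}
		if (status == 404) {
			return "Not found: check the feed URL";
		}
		return "Unexpected status: " + status;
	}

	// Issues a GET request and returns only the status code.
	// Assumes an http or https URL; other schemes fail the cast below.
	static int fetchStatus(String url) throws Exception {
		HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
		conn.setRequestMethod("GET");
		conn.setConnectTimeout(10_000);
		conn.setReadTimeout(10_000);
		try {
			return conn.getResponseCode();
		} finally {
			conn.disconnect();
		}
	}

	public static void main(String[] args) throws Exception {
		if (args.length != 1) {
			System.err.println("Usage: java FeedHttpCheck <feed-url>");
			return;
		}
		int status = fetchStatus(args[0]);
		System.out.println(status + " " + describe(status));
	}
}
```

If the status is not in the 2xx range, there is no feed content to parse, so
the problem is with how the feed is served rather than with the parser.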

Step 3
-------
I have also written a small Java program to determine whether a feed can be
processed:

import java.io.InputStream;
import java.net.URL;

import org.apache.abdera.Abdera;
import org.apache.abdera.model.Document;
import org.apache.abdera.model.Feed;
import org.apache.abdera.parser.Parser;

public class TestFeed {

	public static void main(String[] args) throws Exception {
		if (args.length != 1) {
			System.err.println("Usage: java TestFeed <feed-url>");
			return;
		}

		Parser parser = Abdera.getNewParser();
		// try-with-resources closes the stream even if parsing fails.
		try (InputStream input = new URL(args[0]).openStream()) {
			Document<Feed> doc = parser.parse(input);
			// Throws ClassCastException if the root element is not a Feed.
			Feed feed = doc.getRoot();
			System.out.println("Feed can be parsed");
			System.out.println("Begin feed content -----");
			feed.writeTo(System.out);
			System.out.println("-------End feed content");
		} catch (Exception e) {
			// Any failure (network, parse error, wrong root type) ends up here.
			System.err.println("Feed cannot be parsed");
			e.printStackTrace();
		}
	}

}

-- 
View this message in context: http://www.nabble.com/Error-parsing-certain-feeds---troubleshooting-steps--tf4675516.html#a13358334
Sent from the abdera-user mailing list archive at Nabble.com.


Re: Error parsing certain feeds - troubleshooting steps?

Posted by James M Snell <ja...@gmail.com>.
Hello Kiran,

These steps are good. Whenever I come across something that Abdera
cannot parse, my first step is always to run it through the
FeedValidator. The cause almost invariably turns out to be an error in
how the feed is formed. If the feed passes the validator, I then
retrieve it manually using either curl or a browser and check it over
myself.

- James
