Posted to dev@abdera.apache.org by Garrett Rooney <ro...@electricjellyfish.net> on 2006/06/10 20:33:00 UTC

Graceful handling of non-atom 1.0 feeds

In my experiments with pulling titles out of atom feeds last night, I
inadvertently pointed my PrintTitles program at some atom 0.3 feeds.
The results were, well, explosive.

Now I'm not saying we should parse those feeds; we should really
restrict ourselves to atom 1.0. But it might be nice if we at least
recognized them when we encounter them, so we can throw something more
informative than a ClassCastException (the usual result) or a
NullPointerException (if you've got a ParseFilter set up).

-garrett

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
One other point: if it weren't for a bug in the code that I just fixed,
it would even be possible, using the Extension Factory mechanism, for
someone to plug Atom 0.3 (and even RSS) support into Abdera with
relative ease.  Once I'm sure the bug is indeed fixed, I'll work up a
complete example that shows an Atom 0.3 feed being parsed and processed
as if it were an Atom 1.0 feed :-)

- James

Paul Querna wrote:
> Garrett Rooney wrote:
>> In my experiments with pulling titles out of atom feeds last night, I
>> inadvertently pointed my PrintTitles program at some atom 0.3 feeds.
>> The results were, well, explosive.
>>
>> Now I'm not saying we should parse those feeds; we should really
>> restrict ourselves to atom 1.0. But it might be nice if we at least
>> recognized them when we encounter them, so we can throw something more
>> informative than a ClassCastException (the usual result) or a
>> NullPointerException (if you've got a ParseFilter set up).
> 
> +1.
> 
> As more of a policy issue, do people think Abdera should attempt to
> successfully parse content even if it contains errors or violations of
> the spec?
> 
> Someone somewhere out on the Internet will break the spec, produce
> invalid XML, put invalid encodings in there, miss required fields, put
> invalid data in those fields, etc.  While some of these problems will
> require support from lower-level components (Axiom), much of the handling
> is still up to Abdera.
> 
> -Paul
> 

Re: Graceful handling of non-atom 1.0 feeds

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 6/11/06, James M Snell <ja...@gmail.com> wrote:
> The generics are an API optimization.  The code autodetects what is
> being parsed. In every case, Document.getRoot() returns some derivative
> of the FOM Element interface.  If you know what you're parsing, use the
> generics; if you're not sure what you're parsing, don't use the generics
> and do an instanceof to see what you ended up with.  I'm not sure I see
> the "problem".

Ok, that makes sense.  Sounds like the kind of thing we should be able
to solve with some useful documentation on best practices for using
the APIs.

> Alternatively, plug in an Atom 0.3 Extension Factory and provide a set
> of classes that implement the FOM on top of the Atom 0.3 model.  Then
> you could continue to use the generics form.  By Tuesday I should be
> able to post some sample code that illustrates this...
>
>   Document<Feed> doc = Parser.INSTANCE.parse(atom03FeedStream);
>   Feed feed = doc.getRoot();
>   System.out.println(feed.getClass());   // outputs sample.Atom03Feed

Neat!

-garrett

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
For now, classloaders are not really an issue, but we'll need to tread
carefully, especially when we start to push further into the server side.

- James

Martin Cooper wrote:
> On 6/11/06, James M Snell <ja...@gmail.com> wrote:
>>
>> The generics are an API optimization.  The code autodetects what is
>> being parsed. In every case, Document.getRoot() returns some derivative
>> of the FOM Element interface.  If you know what you're parsing, use the
>> generics; if you're not sure what you're parsing, don't use the generics
>> and do an instanceof to see what you ended up with.  I'm not sure I see
>> the "problem".
> 
> 
> Using instanceof can fail if the object you are testing was created from a
> copy of the class loaded by a different class loader than the one you are
> comparing it to (i.e. when multiple class loaders are in play). Is that a
> non-issue here, somehow, or do we need to guard against it?
> 
> -- 
> Martin Cooper
> 
> 
>> Alternatively, plug in an Atom 0.3 Extension Factory and provide a set
>> of classes that implement the FOM on top of the Atom 0.3 model.  Then
>> you could continue to use the generics form.  By Tuesday I should be
>> able to post some sample code that illustrates this...
>>
>>   Document<Feed> doc = Parser.INSTANCE.parse(atom03FeedStream);
>>   Feed feed = doc.getRoot();
>>   System.out.println(feed.getClass());   // outputs sample.Atom03Feed
>>
>> - James
>>
>> Garrett Rooney wrote:
>> > On 6/10/06, James M Snell <ja...@gmail.com> wrote:
>> >> Abdera will successfully parse any well-formed XML.  The trick is not to
>> >> use generics when parsing.
>> >>
>> >> Document doc = Parser.INSTANCE.parse(someInputStream);
>> >>
>> >> The parser will automatically detect whether the XML stream is an Atom
>> >> document (Feed, Entry or Atom Publishing Protocol Introspection doc) or
>> >> whether it is some other XML.
>> >>
>> >> Element element = doc.getRoot();
>> >>
>> >> if (element instanceof Feed) {
>> >>   // it was an Atom Feed document
>> >> }
>> >> if (element instanceof Entry) {
>> >>   // it was an Atom Entry document
>> >> }
>> >> if (element instanceof Service) {
>> >>   // it was an APP Introspection document
>> >> }
>> >> if (element instanceof ExtensionElement) {
>> >>   // it was arbitrary XML
>> >> }
>> >>
>> >> More below.
>> >
>> > Makes one wonder what the point of the generics is, if you can't use
>> > them if you want to be able to recover gracefully from such
>> > problems...
>> >
>> > -garrett
>> >
>>
> 

Re: Graceful handling of non-atom 1.0 feeds

Posted by Martin Cooper <ma...@apache.org>.
On 6/11/06, James M Snell <ja...@gmail.com> wrote:
>
> The generics are an API optimization.  The code autodetects what is
> being parsed. In every case, Document.getRoot() returns some derivative
> of the FOM Element interface.  If you know what you're parsing, use the
> generics; if you're not sure what you're parsing, don't use the generics
> and do an instanceof to see what you ended up with.  I'm not sure I see
> the "problem".


Using instanceof can fail if the object you are testing was created from a
copy of the class loaded by a different class loader than the one you are
comparing it to (i.e. when multiple class loaders are in play). Is that a
non-issue here, somehow, or do we need to guard against it?

--
Martin Cooper


> Alternatively, plug in an Atom 0.3 Extension Factory and provide a set
> of classes that implement the FOM on top of the Atom 0.3 model.  Then
> you could continue to use the generics form.  By Tuesday I should be
> able to post some sample code that illustrates this...
>
>   Document<Feed> doc = Parser.INSTANCE.parse(atom03FeedStream);
>   Feed feed = doc.getRoot();
>   System.out.println(feed.getClass());   // outputs sample.Atom03Feed
>
> - James
>
> Garrett Rooney wrote:
> > On 6/10/06, James M Snell <ja...@gmail.com> wrote:
> >> Abdera will successfully parse any well-formed XML.  The trick is not to
> >> use generics when parsing.
> >>
> >> Document doc = Parser.INSTANCE.parse(someInputStream);
> >>
> >> The parser will automatically detect whether the XML stream is an Atom
> >> document (Feed, Entry or Atom Publishing Protocol Introspection doc) or
> >> whether it is some other XML.
> >>
> >> Element element = doc.getRoot();
> >>
> >> if (element instanceof Feed) {
> >>   // it was an Atom Feed document
> >> }
> >> if (element instanceof Entry) {
> >>   // it was an Atom Entry document
> >> }
> >> if (element instanceof Service) {
> >>   // it was an APP Introspection document
> >> }
> >> if (element instanceof ExtensionElement) {
> >>   // it was arbitrary XML
> >> }
> >>
> >> More below.
> >
> > Makes one wonder what the point of the generics is, if you can't use
> > them if you want to be able to recover gracefully from such
> > problems...
> >
> > -garrett
> >
>

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
The generics are an API optimization.  The code autodetects what is
being parsed. In every case, Document.getRoot() returns some derivative
of the FOM Element interface.  If you know what you're parsing, use the
generics; if you're not sure what you're parsing, don't use the generics
and do an instanceof to see what you ended up with.  I'm not sure I see
the "problem".

Alternatively, plug in an Atom 0.3 Extension Factory and provide a set
of classes that implement the FOM on top of the Atom 0.3 model.  Then
you could continue to use the generics form.  By Tuesday I should be
able to post some sample code that illustrates this...

  Document<Feed> doc = Parser.INSTANCE.parse(atom03FeedStream);
  Feed feed = doc.getRoot();
  System.out.println(feed.getClass());   // outputs sample.Atom03Feed

- James

Garrett Rooney wrote:
> On 6/10/06, James M Snell <ja...@gmail.com> wrote:
>> Abdera will successfully parse any well-formed XML.  The trick is not to
>> use generics when parsing.
>>
>> Document doc = Parser.INSTANCE.parse(someInputStream);
>>
>> The parser will automatically detect whether the XML stream is an Atom
>> document (Feed, Entry or Atom Publishing Protocol Introspection doc) or
>> whether it is some other XML.
>>
>> Element element = doc.getRoot();
>>
>> if (element instanceof Feed) {
>>   // it was an Atom Feed document
>> }
>> if (element instanceof Entry) {
>>   // it was an Atom Entry document
>> }
>> if (element instanceof Service) {
>>   // it was an APP Introspection document
>> }
>> if (element instanceof ExtensionElement) {
>>   // it was arbitrary XML
>> }
>>
>> More below.
> 
> Makes one wonder what the point of the generics is, if you can't use
> them if you want to be able to recover gracefully from such
> problems...
> 
> -garrett
> 

Re: Graceful handling of non-atom 1.0 feeds

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 6/10/06, James M Snell <ja...@gmail.com> wrote:
> Abdera will successfully parse any well-formed XML.  The trick is not to
> use generics when parsing.
>
> Document doc = Parser.INSTANCE.parse(someInputStream);
>
> The parser will automatically detect whether the XML stream is an Atom
> document (Feed, Entry or Atom Publishing Protocol Introspection doc) or
> whether it is some other XML.
>
> Element element = doc.getRoot();
>
> if (element instanceof Feed) {
>   // it was an Atom Feed document
> }
> if (element instanceof Entry) {
>   // it was an Atom Entry document
> }
> if (element instanceof Service) {
>   // it was an APP Introspection document
> }
> if (element instanceof ExtensionElement) {
>   // it was arbitrary XML
> }
>
> More below.

Makes one wonder what the point of the generics is, if you can't use
them if you want to be able to recover gracefully from such
problems...

-garrett

Re: Graceful handling of non-atom 1.0 feeds

Posted by Elias Torres <el...@torrez.us>.

James M Snell wrote:
> Elias Torres wrote:
>> On 6/11/06, James M Snell <ja...@gmail.com> wrote:
>>> There's actually a very practical use for this arbitrary XML parsing
>>> mechanism that's already in the code.  When you call
>>> content.setValue(...) on an atom:content with XML content, you can pass
>>> in an XML string.  The parser will parse it to create the appropriate
>>> ExtensionElement object to set as the child of the content object. This
>>> also works when setting XHTML content on the Content and Text objects,
>>> making it very simple for us to construct XML and XHTML nodes.
>>>
>>> The API is something like...
>>>
>>>   entry.setContentAsXml("<a><b><c/></b></a>", baseUri);
>> Sure, but that's not as important. That's also the XML parser's job. :)
>>
> 
> The parser's job? The above API is used for building feeds and entries,
> not parsing them.

I guess I don't understand what the feature of setContentAsXml is. I was
just trying to say that if we needed to build XML/XHTML nodes, etc., that
should be the DOM API's job and not ours.

> 
>>> Regarding spec compliance, I had been kicking around the idea of a
>>> Validator.INSTANCE.validate(...) mechanism.  You could pass in any of
>>> the FOM objects and have it validate against the spec.  This would also
>>> allow us to configure validators of various strengths and purposes (e.g.
>>> StrictValidator, LiberalValidator, UlterliberalValidator,
>>> MyGdataValidator, Atom03Validator, etc).  It also provides a clean
>>> separation between the parser and the validator.
>>>
>> This is more like what I was looking for, except maybe I'd rather have
>> this at parsing time, no? It'd be too slow to parse, then walk the FOM
>> for validation. Maybe it's the other way around, too slow to do it at
>> parse time, but if we have validation modes, then we would skip
>> validation. Hopefully this is something that can set Abdera apart, a
>> pluggable Atom validation scheme.
>>
> 
> So long as it is turned off by default.  Via the ParserOptions mechanism
> we can have a setValidator/getValidator API that can be used by the
> parser to determine whether a given element is acceptable.  There are a
> couple of challenges with this, however.  One, we either do the
> validation up front, requiring the entire XML stream to be consumed
> right from the start (leading to significantly increased up-front memory
> consumption), or we do the validation incrementally as each element is
> requested, leading to validation errors that may not show up unless those
> specific invalid entries are requested.  Also, I want to be able to
> validate newly created elements, e.g.,
> 
>   Entry entry = Factory.INSTANCE.newEntry();
>   // set entry properties
>   Validity v = Validator.INSTANCE.validate(entry);
> 
> In this case, the parser is not involved at all.  It's just the
> validator operating against an in-memory object model.
> 
> The right solution is likely to be a combination of these two approaches.

Right, good points. This is why I'm not sure where it really fits
nicely. I'm just advocating that we give validation the time of day.

> 
> - James
> 

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
Elias Torres wrote:
> On 6/11/06, James M Snell <ja...@gmail.com> wrote:
>> There's actually a very practical use for this arbitrary XML parsing
>> mechanism that's already in the code.  When you call
>> content.setValue(...) on an atom:content with XML content, you can pass
>> in an XML string.  The parser will parse it to create the appropriate
>> ExtensionElement object to set as the child of the content object. This
>> also works when setting XHTML content on the Content and Text objects,
>> making it very simple for us to construct XML and XHTML nodes.
>>
>> The API is something like...
>>
>>   entry.setContentAsXml("<a><b><c/></b></a>", baseUri);
> 
> Sure, but that's not as important. That's also the XML parser's job. :)
> 

The parser's job? The above API is used for building feeds and entries,
not parsing them.

>>
>> Regarding spec compliance, I had been kicking around the idea of a
>> Validator.INSTANCE.validate(...) mechanism.  You could pass in any of
>> the FOM objects and have it validate against the spec.  This would also
>> allow us to configure validators of various strengths and purposes (e.g.
>> StrictValidator, LiberalValidator, UlterliberalValidator,
>> MyGdataValidator, Atom03Validator, etc).  It also provides a clean
>> separation between the parser and the validator.
>>
> 
> This is more like what I was looking for, except maybe I'd rather have
> this at parsing time, no? It'd be too slow to parse, then walk the FOM
> for validation. Maybe it's the other way around, too slow to do it at
> parse time, but if we have validation modes, then we would skip
> validation. Hopefully this is something that can set Abdera apart, a
> pluggable Atom validation scheme.
> 

So long as it is turned off by default.  Via the ParserOptions mechanism
we can have a setValidator/getValidator API that can be used by the
parser to determine whether a given element is acceptable.  There are a
couple of challenges with this, however.  One, we either do the
validation up front, requiring the entire XML stream to be consumed
right from the start (leading to significantly increased up-front memory
consumption), or we do the validation incrementally as each element is
requested, leading to validation errors that may not show up unless those
specific invalid entries are requested.  Also, I want to be able to
validate newly created elements, e.g.,

  Entry entry = Factory.INSTANCE.newEntry();
  // set entry properties
  Validity v = Validator.INSTANCE.validate(entry);

In this case, the parser is not involved at all.  It's just the
validator operating against an in-memory object model.

The right solution is likely to be a combination of these two approaches.
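
To make the "validators of various strengths" idea concrete, here is a
rough sketch, assuming a much simpler shape than whatever we end up with.
Everything in it is hypothetical: Entry is a bare stand-in for the FOM
entry, and validate() returns a list of problems rather than the Validity
object mentioned above.  The strict validator checks the three children
that the Atom 1.0 spec requires on every entry (atom:id, atom:title,
atom:updated); what the liberal one checks is an arbitrary choice made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ValidatorDemo {
    public static void main(String[] args) {
        Entry entry = new Entry();
        entry.id = "urn:example:entry-1";   // title and updated left unset

        System.out.println("strict:  " + new StrictValidator().validate(entry));
        System.out.println("liberal: " + new LiberalValidator().validate(entry));
    }
}

// Hypothetical stand-in for a FOM entry; not the real Abdera interface.
final class Entry {
    String id;
    String title;
    String updated;
}

// Hypothetical validator contract: an empty list means "no problems found".
interface Validator {
    List<String> validate(Entry entry);
}

// Requires the children Atom 1.0 makes mandatory on every entry.
final class StrictValidator implements Validator {
    public List<String> validate(Entry entry) {
        List<String> problems = new ArrayList<>();
        if (entry.id == null) problems.add("missing atom:id");
        if (entry.title == null) problems.add("missing atom:title");
        if (entry.updated == null) problems.add("missing atom:updated");
        return problems;
    }
}

// Only insists on an id; an arbitrary illustration of a weaker policy.
final class LiberalValidator implements Validator {
    public List<String> validate(Entry entry) {
        List<String> problems = new ArrayList<>();
        if (entry.id == null) problems.add("missing atom:id");
        return problems;
    }
}
```

Because the validator only sees an in-memory object, the same code could
be called on a freshly built entry or, via something like the
setValidator hook, by the parser as elements are produced.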

- James

Re: Graceful handling of non-atom 1.0 feeds

Posted by Elias Torres <el...@torrez.us>.
On 6/11/06, James M Snell <ja...@gmail.com> wrote:
> There's actually a very practical use for this arbitrary XML parsing
> mechanism that's already in the code.  When you call
> content.setValue(...) on an atom:content with XML content, you can pass
> in an XML string.  The parser will parse it to create the appropriate
> ExtensionElement object to set as the child of the content object. This
> also works when setting XHTML content on the Content and Text objects,
> making it very simple for us to construct XML and XHTML nodes.
>
> The API is something like...
>
>   entry.setContentAsXml("<a><b><c/></b></a>", baseUri);

Sure, but that's not as important. That's also the XML parser's job. :)

>
> Regarding spec compliance, I had been kicking around the idea of a
> Validator.INSTANCE.validate(...) mechanism.  You could pass in any of
> the FOM objects and have it validate against the spec.  This would also
> allow us to configure validators of various strengths and purposes (e.g.
> StrictValidator, LiberalValidator, UlterliberalValidator,
> MyGdataValidator, Atom03Validator, etc).  It also provides a clean
> separation between the parser and the validator.
>

This is more like what I was looking for, except maybe I'd rather have
this at parsing time, no? It'd be too slow to parse, then walk the FOM
for validation. Maybe it's the other way around, too slow to do it at
parse time, but if we have validation modes, then we would skip
validation. Hopefully this is something that can set Abdera apart, a
pluggable Atom validation scheme.

-Elias

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
There's actually a very practical use for this arbitrary XML parsing
mechanism that's already in the code.  When you call
content.setValue(...) on an atom:content with XML content, you can pass
in an XML string.  The parser will parse it to create the appropriate
ExtensionElement object to set as the child of the content object. This
also works when setting XHTML content on the Content and Text objects,
making it very simple for us to construct XML and XHTML nodes.

The API is something like...

  entry.setContentAsXml("<a><b><c/></b></a>", baseUri);

Regarding spec compliance, I had been kicking around the idea of a
Validator.INSTANCE.validate(...) mechanism.  You could pass in any of
the FOM objects and have it validate against the spec.  This would also
allow us to configure validators of various strengths and purposes (e.g.
StrictValidator, LiberalValidator, UlterliberalValidator,
MyGdataValidator, Atom03Validator, etc).  It also provides a clean
separation between the parser and the validator.

- James

Elias Torres wrote:
> On 6/10/06, James M Snell <ja...@gmail.com> wrote:
>> Abdera will successfully parse any well-formed XML.  The trick is not to
>> use generics when parsing.
>>
> [snip]
>>
>> The parser is currently very liberal.  It will make sure that Atom Date
>> Constructs are at least in ISO 8601 format and will validate URIs, but
>> everything else is left wide open.  The absolute minimum it requires is
>> well-formed XML.  A broad spectrum of Atom spec violations is allowed.
>>
>> We don't attempt to correct any of those errors, however.  For example,
>> if someone puts escaped HTML markup in a text construct that is marked
>> as text, Abdera will represent that data as plain text.
>>
>> - James
>>
> 
> I'm sort of in the middle on this. If our main goal is to create a
> fully-compliant Atom parser/protocol implementation, why should we be
> parsing any feed or XML document out there? Well, the answer is
> because the world is not perfect, as Paul mentioned already. But then
> if we are a liberal parser, I'm afraid we'd become something like
> Universal Feed Parser.
> 
> If anything, I propose we add "modes" for parsing in which we can throw
> exceptions and warnings if we see something suspiciously
> non-compliant. We could have strict, liberal, middle-of-the-road modes
> :). This way we serve the community with a well-defined role: a
> validating Atom parser/protocol implementation. Maybe something that
> will allow anyone to host their own "atom validator" for their site/app
> inside or outside of their company.
> 
> Just thinking out loud. However, it seems that a decision on the matter
> should be central to our project.
> 
> -Elias
> 

Re: Graceful handling of non-atom 1.0 feeds

Posted by Elias Torres <el...@torrez.us>.
On 6/10/06, James M Snell <ja...@gmail.com> wrote:
> Abdera will successfully parse any well-formed XML.  The trick is not to
> use generics when parsing.
>
[snip]
>
> The parser is currently very liberal.  It will make sure that Atom Date
> Constructs are at least in ISO 8601 format and will validate URIs, but
> everything else is left wide open.  The absolute minimum it requires is
> well-formed XML.  A broad spectrum of Atom spec violations is allowed.
>
> We don't attempt to correct any of those errors, however.  For example,
> if someone puts escaped HTML markup in a text construct that is marked
> as text, Abdera will represent that data as plain text.
>
> - James
>

I'm sort of in the middle on this. If our main goal is to create a
fully-compliant Atom parser/protocol implementation, why should we be
parsing any feed or XML document out there? Well, the answer is
because the world is not perfect, as Paul mentioned already. But then
if we are a liberal parser, I'm afraid we'd become something like
Universal Feed Parser.

If anything, I propose we add "modes" for parsing in which we can throw
exceptions and warnings if we see something suspiciously
non-compliant. We could have strict, liberal, middle-of-the-road modes
:). This way we serve the community with a well-defined role: a
validating Atom parser/protocol implementation. Maybe something that
will allow anyone to host their own "atom validator" for their site/app
inside or outside of their company.

Just thinking out loud. However, it seems that a decision on the matter
should be central to our project.

-Elias

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
Abdera will successfully parse any well-formed XML.  The trick is not to
use generics when parsing.

Document doc = Parser.INSTANCE.parse(someInputStream);

The parser will automatically detect whether the XML stream is an Atom
document (Feed, Entry or Atom Publishing Protocol Introspection doc) or
whether it is some other XML.

Element element = doc.getRoot();

if (element instanceof Feed) {
  // it was an Atom Feed document
}
if (element instanceof Entry) {
  // it was an Atom Entry document
}
if (element instanceof Service) {
  // it was an APP Introspection document
}
if (element instanceof ExtensionElement) {
  // it was arbitrary XML
}

More below.

Paul Querna wrote:
> Garrett Rooney wrote:
>> In my experiments with pulling titles out of atom feeds last night, I
>> inadvertently pointed my PrintTitles program at some atom 0.3 feeds.
>> The results were, well, explosive.
>>
>> Now I'm not saying we should parse those feeds; we should really
>> restrict ourselves to atom 1.0. But it might be nice if we at least
>> recognized them when we encounter them, so we can throw something more
>> informative than a ClassCastException (the usual result) or a
>> NullPointerException (if you've got a ParseFilter set up).
> 

The NPE is likely a bug. That shouldn't happen.  The ClassCastException
is likely caused by the use of generics.  Atom 0.3 and RSS 1.x/2.x will
be parsed as Document<ExtensionElement> (e.g. doc.getRoot() should
return an instance of FOMExtensionElement)
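
For anyone wondering why the generic form fails so late and so
unhelpfully, here is a small self-contained sketch.  It does not use the
real Abdera classes: Element, Feed, ExtensionElement, Document and
parseAtom03() are all hypothetical stand-ins, and the "parser" simply
hands back an ExtensionElement the way I'd expect ours to for an Atom 0.3
document.  Because of type erasure, the cast inside getRoot() checks
nothing at runtime; the failure only happens where the caller assigns the
result to a Feed.

```java
class GenericsDemo {
    public static void main(String[] args) {
        System.out.println("typed:   " + typedParse());
        System.out.println("untyped: " + untypedParse());
    }

    // Pretend parser: whatever T the caller asks for, the root is really
    // an ExtensionElement (as for an Atom 0.3 or RSS document).
    static <T extends Element> Document<T> parseAtom03() {
        ExtensionElement root = () -> "feed";
        return new Document<>(root);
    }

    // The generic form: compiles fine, fails at the assignment to Feed.
    static String typedParse() {
        try {
            Document<Feed> doc = parseAtom03();
            Feed feed = doc.getRoot();   // compiler-inserted cast to Feed fails here
            return feed.getTitle();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // The non-generic form: inspect the root before committing to a type.
    static String untypedParse() {
        Document<? extends Element> doc = parseAtom03();
        Element root = doc.getRoot();
        if (root instanceof Feed) {
            return "Atom feed: " + ((Feed) root).getTitle();
        }
        if (root instanceof ExtensionElement) {
            return "arbitrary XML: " + ((ExtensionElement) root).getName();
        }
        return "unknown";
    }
}

// Hypothetical stand-ins for the FOM interfaces.
interface Element {}
interface Feed extends Element { String getTitle(); }
interface ExtensionElement extends Element { String getName(); }

// A document whose root type is claimed by the caller but never verified.
final class Document<T extends Element> {
    private final Element root;

    Document(Element root) { this.root = root; }

    @SuppressWarnings("unchecked")
    T getRoot() { return (T) root; }   // unchecked: erased to a cast to Element
}
```

The untyped path is where a more informative "this is not an Atom 1.0
document" error could be raised, since it sees the real type before any
cast is attempted.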

> +1.
> 
> As more of a policy issue, do people think Abdera should attempt to
> successfully parse content even if it contains errors or violations of
> the spec?
> 

The parser is currently very liberal.  It will make sure that Atom Date
Constructs are at least in ISO 8601 format and will validate URIs, but
everything else is left wide open.  The absolute minimum it requires is
well-formed XML.  A broad spectrum of Atom spec violations is allowed.

We don't attempt to correct any of those errors, however.  For example,
if someone puts escaped HTML markup in a text construct that is marked
as text, Abdera will represent that data as plain text.
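
To give a feel for how little is being checked, the two checks amount to
something like the sketch below.  This is not the code in the parser; it
is an illustration using only the JDK, and the class and method names are
made up.  Note that OffsetDateTime.parse accepts just one ISO 8601
profile (a full date-time with an offset), so it is stricter than "at
least ISO 8601" in general.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

class LiberalChecks {
    public static void main(String[] args) {
        System.out.println(looksLikeIsoDateTime("2006-06-10T20:33:00Z"));
        System.out.println(looksLikeIsoDateTime("last Tuesday"));
        System.out.println(looksLikeUri("http://example.org/feed"));
        System.out.println(looksLikeUri("http://example.org/a b"));
    }

    // True if the value parses as an ISO 8601 date-time with an offset.
    static boolean looksLikeIsoDateTime(String value) {
        try {
            OffsetDateTime.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // True if the value is syntactically acceptable to java.net.URI.
    static boolean looksLikeUri(String value) {
        try {
            new URI(value);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

Anything beyond these two kinds of value (missing required elements,
wrong content types, and so on) passes through untouched.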


> Someone somewhere out on the Internet will break the spec, produce
> invalid XML, put invalid encodings in there, miss required fields, put
> invalid data in those fields, etc.  While some of these problems will
> require support from lower-level components (Axiom), much of the handling
> is still up to Abdera.
> 
> -Paul
> 

- James

Re: Graceful handling of non-atom 1.0 feeds

Posted by Paul Querna <pq...@apache.org>.
Garrett Rooney wrote:
> In my experiments with pulling titles out of atom feeds last night, I
> inadvertently pointed my PrintTitles program at some atom 0.3 feeds.
> The results were, well, explosive.
> 
> Now I'm not saying we should parse those feeds; we should really
> restrict ourselves to atom 1.0. But it might be nice if we at least
> recognized them when we encounter them, so we can throw something more
> informative than a ClassCastException (the usual result) or a
> NullPointerException (if you've got a ParseFilter set up).

+1.

As more of a policy issue, do people think Abdera should attempt to
successfully parse content even if it contains errors or violations of
the spec?

Someone somewhere out on the Internet will break the spec, produce
invalid XML, put invalid encodings in there, miss required fields, put
invalid data in those fields, etc.  While some of these problems will
require support from lower-level components (Axiom), much of the handling
is still up to Abdera.

-Paul

Re: Graceful handling of non-atom 1.0 feeds

Posted by James M Snell <ja...@gmail.com>.
The parser will actually parse any arbitrary XML document.  Try the
parse without the generic, e.g.

Document doc = Parser.INSTANCE.parse(...)

doc.getRoot() will return an instance of ExtensionElement rather than
Feed.  This would be the right way of parsing in the case you're not
exactly sure what you're pointing at.  You can then do an instanceof to
determine if you ended up with the right thing.

The NPE sounds like a bug.  That shouldn't happen.

- James

Garrett Rooney wrote:
> In my experiments with pulling titles out of atom feeds last night, I
> inadvertently pointed my PrintTitles program at some atom 0.3 feeds.
> The results were, well, explosive.
> 
> Now I'm not saying we should parse those feeds; we should really
> restrict ourselves to atom 1.0. But it might be nice if we at least
> recognized them when we encounter them, so we can throw something more
> informative than a ClassCastException (the usual result) or a
> NullPointerException (if you've got a ParseFilter set up).
> 
> -garrett
>