Posted to dev@manifoldcf.apache.org by "Benjamin Brandmeier (JIRA)" <ji...@apache.org> on 2013/11/11 11:53:17 UTC

[jira] [Created] (CONNECTORS-805) Crawling author metadata from feeds

Benjamin Brandmeier created CONNECTORS-805:
----------------------------------------------

             Summary: Crawling author metadata from feeds
                 Key: CONNECTORS-805
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-805
             Project: ManifoldCF
          Issue Type: Bug
          Components: RSS connector
    Affects Versions: ManifoldCF 1.4
            Reporter: Benjamin Brandmeier
             Fix For: ManifoldCF 1.5


This issue requests functionality for retrieving the author of an RSS entry.

The feed specifications each represent the author differently:

RSS 2.0 (Source -> http://www.rssboard.org/rss-specification#ltauthorgtSubelementOfLtitemgt):

<author> sub-element of <item>
It's the email address of the author of the item. For newspapers and magazines syndicating via RSS, the author is the person who wrote the article that the <item> describes.
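As a rough illustration of the RSS 2.0 case, the sketch below reads the optional <author> sub-element of each <item> with Python's standard library. The sample feed and the function name rss_item_authors are hypothetical and not part of the RSS connector; the sketch only shows that the value is a plain string that may be missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, made up for this illustration.
RSS_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <item>
      <title>First</title>
      <author>jane@example.com (Jane Doe)</author>
    </item>
    <item>
      <title>Second</title>
    </item>
  </channel>
</rss>"""


def rss_item_authors(xml_text):
    """Return (title, author) pairs; author is None when <author> is absent."""
    root = ET.fromstring(xml_text)
    pairs = []
    for item in root.iter("item"):
        title = item.findtext("title")
        author = item.findtext("author")  # None if the item has no <author>
        pairs.append((title, author.strip() if author else None))
    return pairs


if __name__ == "__main__":
    for title, author in rss_item_authors(RSS_SAMPLE):
        print(title, "->", author)
```

Because <author> is optional, a crawler would probably have to treat a missing value as "no author metadata" rather than as an error.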

Atom (Source -> http://www.ietf.org/rfc/rfc4287.txt):

The "atom:author" element is a Person construct that indicates the author of the entry or feed.

atomAuthor = element atom:author { atomPersonConstruct }

If an atom:entry element does not contain atom:author elements, then
the atom:author elements of the contained atom:source element are
considered to apply.  In an Atom Feed Document, the atom:author
elements of the containing atom:feed element are considered to apply
to the entry if there are no atom:author elements in the locations
described above.
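The fallback order quoted above (entry, then the contained atom:source, then the containing atom:feed) could be sketched as follows. This is a minimal illustration under assumptions: the sample document and the names effective_authors and authors_by_entry are hypothetical, and only the atom:name child of each author is read.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed covering the three cases.
ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <author><name>Feed Author</name></author>
  <entry>
    <title>Own author</title>
    <author><name>Entry Author</name></author>
  </entry>
  <entry>
    <title>From source</title>
    <source>
      <author><name>Source Author</name></author>
    </source>
  </entry>
  <entry>
    <title>Inherited</title>
  </entry>
</feed>"""


def effective_authors(feed, entry):
    """Apply the fallback: entry authors, else source authors, else feed authors."""
    authors = entry.findall(ATOM + "author")
    if not authors:
        source = entry.find(ATOM + "source")
        if source is not None:
            authors = source.findall(ATOM + "author")
    if not authors:
        authors = feed.findall(ATOM + "author")
    return [a.findtext(ATOM + "name") for a in authors]


def authors_by_entry(xml_text):
    """Map each entry title to the author names that apply to it."""
    feed = ET.fromstring(xml_text)
    return {
        entry.findtext(ATOM + "title"): effective_authors(feed, entry)
        for entry in feed.findall(ATOM + "entry")
    }


if __name__ == "__main__":
    for title, names in authors_by_entry(ATOM_SAMPLE).items():
        print(title, "->", names)
```

The result is a list per entry because an entry may carry more than one atom:author element.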

The atomPersonConstruct looks like this:

   atomPersonConstruct =
      atomCommonAttributes,
      (element atom:name { text }
       & element atom:uri { atomUri }?
       & element atom:email { atomEmailAddress }?
       & extensionElement*)

where atomCommonAttributes is defined like this:

   atomCommonAttributes =
      attribute xml:base { atomUri }?,
      attribute xml:lang { atomLanguageTag }?,
      undefinedAttribute*
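To make the Person construct concrete, the sketch below maps one such element to a dictionary: atom:name is required, atom:uri and atom:email are optional, and xml:lang is one of the common attributes. The function name parse_person and the sample element are hypothetical; extension elements and xml:base are ignored here.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XML = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical sample Person construct with no atom:uri child.
PERSON_SAMPLE = """<author xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <name>Jane Doe</name>
  <email>jane@example.com</email>
</author>"""


def parse_person(element):
    """Read the name, uri, email and xml:lang of a Person construct.

    Optional parts that are absent come back as None.
    """
    return {
        "name": element.findtext(ATOM + "name"),
        "uri": element.findtext(ATOM + "uri"),
        "email": element.findtext(ATOM + "email"),
        "lang": element.get(XML + "lang"),
    }


if __name__ == "__main__":
    print(parse_person(ET.fromstring(PERSON_SAMPLE)))
```

The same function would apply unchanged to atom:contributor, since that element is also a Person construct.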

Furthermore, there is an atom:contributor element:

The "atom:contributor" element is a Person construct that indicates a person or other entity who contributed to the entry or feed.

atomContributor = element atom:contributor { atomPersonConstruct }

For further information, please check the specification.

Dublin Core (Source -> http://dublincore.org/documents/dcmi-type-vocabulary/index.shtml#elements-creator):

<dc:creator>
The primary individual responsible for the content of the resource.

The element can be at the <item>, <image> or <channel> level.
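Since <dc:creator> can appear at more than one level, a crawler has to decide which value to attach to an item. The sketch below prefers the item-level value and falls back to the channel-level one; that fallback is an assumption made for illustration, not something the Dublin Core text prescribes. The sample feed and the function name creators are hypothetical, and the <image> level is not handled.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical sample feed with a channel-level and an item-level creator.
DC_SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <dc:creator>Channel Creator</dc:creator>
    <item>
      <title>First</title>
      <dc:creator>Item Creator</dc:creator>
    </item>
    <item>
      <title>Second</title>
    </item>
  </channel>
</rss>"""


def creators(xml_text):
    """Return (title, creator) pairs, using the channel creator as a fallback."""
    channel = ET.fromstring(xml_text).find("channel")
    channel_creator = channel.findtext(DC + "creator")  # direct child only
    return [
        (item.findtext("title"), item.findtext(DC + "creator") or channel_creator)
        for item in channel.findall("item")
    ]


if __name__ == "__main__":
    for title, creator in creators(DC_SAMPLE):
        print(title, "->", creator)
```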



--
This message was sent by Atlassian JIRA
(v6.1#6144)