Posted to dev@manifoldcf.apache.org by "Benjamin Brandmeier (JIRA)" <ji...@apache.org> on 2013/11/11 11:53:17 UTC
[jira] [Created] (CONNECTORS-805) Crawling author metadata from feeds
Benjamin Brandmeier created CONNECTORS-805:
----------------------------------------------
Summary: Crawling author metadata from feeds
Key: CONNECTORS-805
URL: https://issues.apache.org/jira/browse/CONNECTORS-805
Project: ManifoldCF
Issue Type: Bug
Components: RSS connector
Affects Versions: ManifoldCF 1.4
Reporter: Benjamin Brandmeier
Fix For: ManifoldCF 1.5
Add functionality for retrieving the author of an RSS entry.
The feed specifications each represent the author differently:
RSS 2.0 (Source -> http://www.rssboard.org/rss-specification#ltauthorgtSubelementOfLtitemgt):
<author> sub-element of <item>
It's the email address of the author of the item. For newspapers and magazines syndicating via RSS, the author is the person who wrote the article that the <item> describes.
Atom (Source -> http://www.ietf.org/rfc/rfc4287.txt):
The "atom:author" element is a Person construct that indicates the author of the entry or feed.
atomAuthor = element atom:author { atomPersonConstruct }
If an atom:entry element does not contain atom:author elements, then
the atom:author elements of the contained atom:source element are
considered to apply. In an Atom Feed Document, the atom:author
elements of the containing atom:feed element are considered to apply
to the entry if there are no atom:author elements in the locations
described above.
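The fallback order quoted above (entry, then atom:source, then the containing feed) can be sketched as follows. This is an illustrative Python sketch only, not ManifoldCF connector code; the function names atom_entry_authors and author_names and the SAMPLE feed are hypothetical.

```python
import xml.etree.ElementTree as ET

# Atom namespace from RFC 4287, in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_entry_authors(feed, entry):
    """Return the atom:author elements that apply to one entry.

    Order: the entry's own authors, then those of its atom:source,
    then those of the containing atom:feed.
    """
    authors = entry.findall(ATOM + "author")
    if authors:
        return authors
    source = entry.find(ATOM + "source")
    if source is not None:
        authors = source.findall(ATOM + "author")
        if authors:
            return authors
    # findall() with a plain tag only matches direct children of the feed,
    # so authors nested inside entries are not picked up here.
    return feed.findall(ATOM + "author")


def author_names(feed_xml):
    """Return one list of author names per entry, in document order."""
    feed = ET.fromstring(feed_xml)
    names = []
    for entry in feed.findall(ATOM + "entry"):
        authors = atom_entry_authors(feed, entry)
        names.append([(a.findtext(ATOM + "name") or "").strip() for a in authors])
    return names


# Hypothetical feed exercising all three cases.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>Feed Author</name></author>
  <entry><title>a</title><author><name>Entry Author</name></author></entry>
  <entry><title>b</title>
    <source><author><name>Source Author</name></author></source>
  </entry>
  <entry><title>c</title></entry>
</feed>"""

print(author_names(SAMPLE))
```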
The atomPersonConstruct looks like this:
atomPersonConstruct =
   atomCommonAttributes,
   (element atom:name { text }
    & element atom:uri { atomUri }?
    & element atom:email { atomEmailAddress }?
    & extensionElement*)
where atomCommonAttributes is defined like this:
atomCommonAttributes =
   attribute xml:base { atomUri }?,
   attribute xml:lang { atomLanguageTag }?,
   undefinedAttribute*
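To make the shape of a Person construct concrete, here is a minimal Python sketch that maps one onto a small data class: atom:name is required, atom:uri and atom:email are optional. The Person class and the parse_person helper are hypothetical names for illustration, and treating a blank atom:name as missing is a choice of this sketch rather than something the grammar demands. Common attributes and extension elements are ignored.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

ATOM = "{http://www.w3.org/2005/Atom}"


@dataclass
class Person:
    """The three standard children of an Atom Person construct."""
    name: str
    uri: Optional[str] = None
    email: Optional[str] = None


def _clean(text):
    """Strip whitespace; map missing or blank text to None."""
    if text is None:
        return None
    text = text.strip()
    return text or None


def parse_person(elem):
    """Build a Person from an atom:author or atom:contributor element."""
    name = _clean(elem.findtext(ATOM + "name"))
    if name is None:
        # Assumption of this sketch: a blank name is treated like a missing one.
        raise ValueError("Person construct has no usable atom:name")
    return Person(
        name=name,
        uri=_clean(elem.findtext(ATOM + "uri")),
        email=_clean(elem.findtext(ATOM + "email")),
    )


author = ET.fromstring(
    '<author xmlns="http://www.w3.org/2005/Atom">'
    "<name>Jane Doe</name><email>jane@example.com</email>"
    "</author>"
)
person = parse_person(author)
print(person)
```

Because atom:contributor is also a Person construct, the same helper would apply to it unchanged.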
Furthermore, there is an atom:contributor element:
The "atom:contributor" element is a Person construct that indicates a person or other entity who contributed to the entry or feed.
atomContributor = element atom:contributor { atomPersonConstruct }
For further information, please check the specification.
Dublin Core (Source -> http://dublincore.org/documents/dcmi-type-vocabulary/index.shtml#elements-creator)
<dc:creator>
The primary individual responsible for the content of the resource.
The element can be at the <item>, <image> or <channel> level.
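For RSS 2.0 feeds, one possible way to combine the two sources described above is sketched below in Python. The lookup order (the item's <author>, then the item's <dc:creator>, then the channel's <dc:creator>) is an assumption made for illustration, not something either specification prescribes, and the function name rss_item_authors and the SAMPLE_RSS feed are hypothetical. RSS 1.0 (RDF) feeds and <image>-level creators are not handled.

```python
import xml.etree.ElementTree as ET

# Dublin Core elements namespace, in ElementTree's "{uri}tag" notation.
DC = "{http://purl.org/dc/elements/1.1/}"


def rss_item_authors(rss_xml):
    """Return one author string (or None) per <item> of an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        return []
    channel_creator = channel.findtext(DC + "creator")
    authors = []
    for item in channel.findall("item"):
        # Assumed preference order; "or" also skips empty elements.
        value = (
            item.findtext("author")
            or item.findtext(DC + "creator")
            or channel_creator
        )
        authors.append(value.strip() if value else None)
    return authors


# Hypothetical feed exercising all three cases.
SAMPLE_RSS = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <dc:creator>Channel Creator</dc:creator>
    <item><title>a</title><author>jane@example.com (Jane Doe)</author></item>
    <item><title>b</title><dc:creator>John Doe</dc:creator></item>
    <item><title>c</title></item>
  </channel>
</rss>"""

print(rss_item_authors(SAMPLE_RSS))
```

Since the RSS 2.0 <author> value is an email address while <dc:creator> is typically a name, a connector would likely want to record which element the value came from.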
--
This message was sent by Atlassian JIRA
(v6.1#6144)