Posted to dev@any23.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/03/26 21:21:15 UTC
[jira] [Commented] (ANY23-168) RDFa properties in <meta> elements not picked up
[ https://issues.apache.org/jira/browse/ANY23-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948403#comment-13948403 ]
Lewis John McGibbney commented on ANY23-168:
--------------------------------------------
[~rubenverborgh], did you try to set the boolean configuration property 'any23.extraction.head.meta' to true?
By default org.apache.any23.extractor.html.HTMLMetaExtractor is disabled.
> RDFa properties in <meta> elements not picked up
> ------------------------------------------------
>
> Key: ANY23-168
> URL: https://issues.apache.org/jira/browse/ANY23-168
> Project: Apache Any23
> Issue Type: Bug
> Reporter: Ruben Verborgh
> Labels: meta-tags, rdfa
> Fix For: 1.0.0
>
>
> RDFa annotations in <meta> elements are not picked up:
> http://ruben.verborgh.org/tmp/dctitle-test.html
> http://any23.org/any23/?uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fdctitle-test.html
> The Structured Data Testing Tool finds them:
> http://www.google.com/webmasters/tools/richsnippets?q=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fdctitle-test.html
> Additionally, I wonder whether it would be a good idea to drop the dcterms:title property extracted from <title> if an actual dc:title property is present. This allows for more meaningful titles, for instance:
> <title>HTML Title – Website Name</title>
> <meta property="dc:title" content="DC Title"/>
> This would overcome the common situation in which the HTML <title> also contains the website name etc. and is therefore not suited for a "clean" dc:title. I would thus say that an actual dc:title has precedence over an implied dc:title from <title>.
> Furthermore, I'm confused by the double appearance of
> <http://ruben.verborgh.org/tmp/dctitle-test.html> dcterms:title "HTML Title – Website Name" .
> AND
> <http://ruben.verborgh.org/tmp/dctitle-test.html> <http://www.w3.org/1999/xhtml/microdata#item> _:nodecfcd208495d565ef66e7dff9f98764da ;
> dcterms:title "HTML Title – Website Name" .
> Should the page itself AND some blank node have this dcterms:title? (And what happens if the <meta> tags are parsed?)
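The precedence rule proposed in the issue description (an explicit dc:title from a <meta> element wins over the title implied by <title>) can be illustrated with a small standalone sketch. It uses only Python's standard html.parser module and is not Any23 code; the names TitleCollector and preferred_title are hypothetical and chosen purely for illustration.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the <title> text and the content of <meta property="dc:title">.

    Hypothetical helper for illustration only; not part of Any23.
    """

    def __init__(self):
        super().__init__()
        self.html_title = None  # text found inside <title>
        self.dc_title = None    # content of an explicit dc:title <meta>
        self._in_title = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
            self._buf = []
        elif tag == "meta" and attrs.get("property") == "dc:title":
            self.dc_title = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.html_title = "".join(self._buf).strip()

    def handle_data(self, data):
        # Title text may arrive in several chunks, so buffer it.
        if self._in_title:
            self._buf.append(data)


def preferred_title(html):
    """Return the explicit dc:title if present, else the <title> text."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    if parser.dc_title is not None:
        return parser.dc_title
    return parser.html_title
```

Fed the two lines from the example in the description, preferred_title returns the dc:title value; for a page without the <meta> element it falls back to the <title> text.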