
Posted to commits@stanbol.apache.org by "Walter Kasper (JIRA)" <ji...@apache.org> on 2012/12/19 11:17:12 UTC

[jira] [Resolved] (STANBOL-771) HtmlExtractor: Add an extractor for Microdata

     [ https://issues.apache.org/jira/browse/STANBOL-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Walter Kasper resolved STANBOL-771.
-----------------------------------

    Resolution: Fixed

Initial support for Microdata extraction from HTML5 pages is now provided. The extractor constructs an RDF representation of the microdata. In general, this is not trivial, because microdata is much less formalized than RDF. The problems and some proposed solutions are discussed in http://www.w3.org/TR/microdata-rdf/. Some of those proposals were adopted in this initial version, especially for microdata based on schema.org.
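
To illustrate what "an RDF representation of the microdata" can look like, here is a minimal sketch. It is not the Stanbol HtmlExtractor code; it uses only the Python standard library, and the names MicrodataToTriples and extract are hypothetical. It assumes well-formed markup and a schema.org-style vocabulary in which a property URI is the vocabulary base of the item's first itemtype plus the itemprop name, which is a simplification of the proposals in the W3C note. Objects are kept as plain strings, so literals and IRIs are not distinguished.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Elements without an end tag; they never go on the open-element stack.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input", "area", "source"}
# Elements whose property value comes from an attribute, not from text.
VALUE_ATTRS = {"a": "href", "link": "href", "area": "href",
               "img": "src", "meta": "content"}


class MicrodataToTriples(HTMLParser):
    """Collects (subject, predicate, object) triples from HTML microdata."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.open_elements = []  # one frame per open non-void element
        self.items = []          # stack of (subject, vocab) for open items
        self.bnode_count = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        parent = self.items[-1] if self.items else None
        frame = {"is_item": False, "text_prop": None}
        value = None

        if "itemscope" in a:
            # A new item: use itemid if given, otherwise a blank node.
            self.bnode_count += 1
            subject = a.get("itemid") or f"_:b{self.bnode_count}"
            types = (a.get("itemtype") or "").split()
            for t in types:
                self.triples.append((subject, RDF_TYPE, t))
            if types:
                # Assumption: vocabulary base = type URI up to last '/' or '#'.
                cut = max(types[0].rfind("/"), types[0].rfind("#")) + 1
                vocab = types[0][:cut]
            else:
                vocab = parent[1] if parent else ""
            value = subject
            if tag not in VOID_TAGS:
                self.items.append((subject, vocab))
                frame["is_item"] = True
        elif tag in VALUE_ATTRS:
            value = a.get(VALUE_ATTRS[tag]) or ""

        if parent and a.get("itemprop"):
            # Absolute property names are kept; others get the vocab prefix.
            predicates = [n if ":" in n else parent[1] + n
                          for n in a["itemprop"].split()]
            if value is not None:
                for p in predicates:
                    self.triples.append((parent[0], p, value))
            elif tag not in VOID_TAGS:
                # Value is the element's text; emitted when the element closes.
                frame["text_prop"] = (parent[0], predicates, [])

        if tag not in VOID_TAGS:
            self.open_elements.append(frame)

    def handle_data(self, data):
        for frame in self.open_elements:
            if frame["text_prop"]:
                frame["text_prop"][2].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.open_elements:
            return
        frame = self.open_elements.pop()
        if frame["is_item"]:
            self.items.pop()
        if frame["text_prop"]:
            subject, predicates, parts = frame["text_prop"]
            for p in predicates:
                self.triples.append((subject, p, "".join(parts).strip()))


def extract(html):
    parser = MicrodataToTriples()
    parser.feed(html)
    parser.close()
    return parser.triples


SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <a itemprop="url" href="http://example.org/jane">home page</a>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="addressLocality">Berlin</span>
  </div>
</div>
"""

for triple in extract(SAMPLE):
    print(triple)
```

The nested address item becomes its own blank node and is linked to the person through the address property, which is the kind of structure a more complete extractor would hand to an RDF library.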
                
> HtmlExtractor: Add an extractor for Microdata 
> ----------------------------------------------
>
>                 Key: STANBOL-771
>                 URL: https://issues.apache.org/jira/browse/STANBOL-771
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Engine - HTML Extractor
>            Reporter: Walter Kasper
>            Assignee: Walter Kasper
>
> An increasing number of sites use Microdata, introduced in HTML5, in addition to Microformats and RDFa annotations to encode semantic information. The HtmlExtractor should handle this and extract an RDF representation of the Microdata.
