Posted to dev@any23.apache.org by "David Cockbill (Jira)" <ji...@apache.org> on 2019/09/18 13:04:00 UTC

[jira] [Commented] (ANY23-428) RDFa parse issue if vocab not defined with trailing slash

    [ https://issues.apache.org/jira/browse/ANY23-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932419#comment-16932419 ] 

David Cockbill commented on ANY23-428:
--------------------------------------

I have reproduced this with a test, which I have checked into my personal fork (I hope it is visible to all :) ):

[https://github.com/davidcockbill/any23/tree/ANY23-428]

The fix, I believe, needs to be in the [Semargl project|https://github.com/semarglproject/semargl], specifically in "rdfa/src/main/java/org/semarglproject/rdf/rdfa/Vocabulary.java".

I changed the resolveTerm() method as follows:

 
{code:java}
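    String resolveTerm(final String term) {
        // Insert "/" unless the vocab URL already ends with a separator.
        final String separator = (url.endsWith("/") || url.endsWith("#")) ? "" : "/";
        final String termUri = url + separator + term;
        // With no explicit term list, accept any absolute IRI; otherwise require a listed term.
        if ((terms == null && RIUtils.isAbsoluteIri(termUri)) || (terms != null && terms.contains(termUri))) {
            return termUri;
        }
        return null;
    }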
{code}
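
For anyone who wants to try the separator logic without building Semargl, here is a minimal standalone sketch. The class name VocabularySketch and the regex-based isAbsoluteIri() are my own stand-ins (assumptions, not the Semargl classes); only the body of resolveTerm() mirrors the change above.

```java
import java.util.Set;

// Hypothetical stand-in for Semargl's Vocabulary class, for illustration only.
final class VocabularySketch {
    private final String url;
    private final Set<String> terms; // null means no explicit term list

    VocabularySketch(String url, Set<String> terms) {
        this.url = url;
        this.terms = terms;
    }

    String resolveTerm(final String term) {
        // Insert "/" unless the vocab URL already ends with "/" or "#".
        final String separator = (url.endsWith("/") || url.endsWith("#")) ? "" : "/";
        final String termUri = url + separator + term;
        if ((terms == null && isAbsoluteIri(termUri)) || (terms != null && terms.contains(termUri))) {
            return termUri;
        }
        return null;
    }

    // Crude assumption: "absolute" means the string starts with a scheme and a colon.
    // The real RIUtils.isAbsoluteIri() may well be stricter.
    static boolean isAbsoluteIri(String candidate) {
        return candidate.matches("[A-Za-z][A-Za-z0-9+.-]*:.*");
    }

    public static void main(String[] args) {
        // Both spellings of the vocab print https://schema.org/BreadcrumbList
        System.out.println(new VocabularySketch("https://schema.org", null).resolveTerm("BreadcrumbList"));
        System.out.println(new VocabularySketch("https://schema.org/", null).resolveTerm("BreadcrumbList"));
    }
}
```

Vocabularies that end in "#" are left untouched, so the term is appended directly after the hash.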
 

This seems to fix the issue, and the Semargl unit tests pass. However, I'm not sure whether this change introduces other issues, or whether we should use a URI library to build the URL rather than my crude string handling.
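
On the URI library question, one caveat I am fairly sure of: standard reference resolution (for example java.net.URI.resolve()) is not a drop-in replacement for the concatenation, because it replaces the last path segment of the base and drops its fragment. The sketch below only illustrates that behaviour; the class and method names are made up for this example.

```java
import java.net.URI;

// Hypothetical helper, only to show how reference resolution treats a vocab URL.
final class UriResolveCheck {
    // Resolves the term against the vocab as a relative reference (RFC 3986 style).
    static String resolveWithUri(String vocab, String term) {
        return URI.create(vocab).resolve(term).toString();
    }

    public static void main(String[] args) {
        // A slash-terminated vocab behaves like plain concatenation.
        System.out.println(resolveWithUri("https://schema.org/", "BreadcrumbList"));
        // A hash-terminated vocab does not: the "rdf-schema#" part is replaced by the term.
        System.out.println(resolveWithUri("http://www.w3.org/2000/01/rdf-schema#", "label"));
    }
}
```

So if a URI library is used at all, it is probably better suited to validating the joined string than to doing the join itself.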

I'll need some advice on how to proceed. Presumably that means creating an issue on the Semargl project, getting the fix merged and released, and then pulling the new version into Any23?

 

 

 

> RDFa parse issue if vocab not defined with trailing slash
> ---------------------------------------------------------
>
>                 Key: ANY23-428
>                 URL: https://issues.apache.org/jira/browse/ANY23-428
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: extractors
>    Affects Versions: 2.3
>            Reporter: David Cockbill
>            Priority: Minor
>
> If an RDFa vocab URL is missing a trailing forward slash, the properties are not expanded correctly.
> For example:
>  
> {code:java}
> <ol vocab="https://schema.org" typeof="BreadcrumbList">
> {code}
> rather than:
>  
> {code:java}
> <ol vocab="https://schema.org/" typeof="BreadcrumbList">
> {code}
> produces properties that look as follows (in N-Triples):
>  
>  
> {code:java}
> <http://example.com> <http://www.w3.org/ns/rdfa#usesVocabulary> <http://schema.org> .
> _:n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.orgBreadcrumbList> .
> _:n1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.orgListItem> .
> {code}
>  
>  
> I'm sure the intention should be to join the properties and vocab with a forward slash.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)