Posted to users@jena.apache.org by Andreas Kahl <ka...@bsb-muenchen.de> on 2018/04/03 13:04:02 UTC

Representing Sequences/Lists in RDF/XML and iterating in Jena

 Hello everyone,

we have the requirement to preserve the order of umbel:isLike resources
and rdagr2:alternateIdentity literals in
https://data.rism.info/id/rismauthorities/pe30077074?format=rdf . To my
understanding, rdf:List would be a good choice. I followed the
recommendation at
http://patterns.dataincubator.org/book/ordered-list.html (data shortened
for brevity here): 
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:nodeID="genid5">
        <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#List"/>
        <rdf:li rdf:resource="http://digital.slub-dresden.de/id339704470"/>
        <rdf:li rdf:resource="http://digital.slub-dresden.de/id313540659"/>
        <rdf:li rdf:resource="http://digital.slub-dresden.de/id307035654"/>
    </rdf:Description>
    <rdf:Description rdf:about="http://data.rism.info/id/rismauthorities/pe30077074">
        <umbel:isLike xmlns:umbel="http://umbel.org/umbel#" rdf:nodeID="genid5"/>
    </rdf:Description>
</rdf:RDF>

Selecting this list via SimpleSelector and using Jena's RDFList fails
because no rdf:first & rdf:rest are found: 
// node is the object returned by the SimpleSelector
// (<http://data.rism.info/id/rismauthorities/pe30077074>, umbel:isLike, null).
RDFList list = node.as(RDFList.class);
boolean valid = list.isValid(); // returns false
ExtendedIterator<RDFNode> items = list.iterator();
while (items.hasNext()) {
    Resource listItem = items.next().asResource(); // throws an exception
    ...
}

The only way I found to get RDFList to work is adding
rdf:parseType='Collection', but then more blank nodes are created and
only one element is found in the iteration (and not necessarily the
first). The W3C validator says the above rdf:List is valid RDF/XML.
Should I use another Java interface in Jena for this? Or do I have to
implement the list with lower-level Jena calls, avoiding the RDFList
interface (e.g. iterating over the _1, _2, _3, ... predicates with
selectors myself)?

Thanks & Best Regards
Andreas

P.S. My Jena dependency and version currently used: 
<dependency>
     <groupId>org.apache.jena</groupId>
     <artifactId>apache-jena</artifactId>
     <version>3.6.0</version>
     <type>pom</type>
</dependency>


Antw: Re: Representing Sequences/Lists in RDF/XML and iterating in Jena

Posted by Andreas Kahl <ka...@bsb-muenchen.de>.
Hello Dave and Christopher, 

Thank you very much for your clarifying feedback. I had mistakenly thought of rdf:List as a container. All this led me to a helpful Stack Overflow answer:
https://stackoverflow.com/a/17589666/532272 . As I do not need finite collections, I chose rdf:Seq, although rdf:List might have advantages (I just need control over the order of entries). The main reason is my environment: I generate RDF from XML via XSLT and parse it with Jena to check well-formedness, so creating the rdf:List in Turtle, N-Triples or via the API is not an option. I also learned that rdf:List is a nested structure rather than a container, and creating that via XSLT would be very cumbersome. In contrast, creating a Seq container and filling it with rdf:li is very convenient.

I have one question left: what is the best practice for recognizing a Seq dynamically? At the moment I have hardcoded the assumption that I have a Seq if the current node isAnon(). This works with my data, but is obviously not very generic. Any suggestions for improvement? In particular: is there a Jena API call along the lines of isResource() or isLiteral(), such as isList() or isSeq()?

Best Regards
Andreas

The Java code reading the Seq is now this:
// Outer method:
public Set<String> getLiteralsByPropertyURI(final String propertyUri) throws TitleDataInvalidException {
    final List<RDFNode> objectList = ...
    // LinkedHashSet keeps insertion order
    final Set<String> output = new LinkedHashSet<>();
    if (objectList != null && !objectList.isEmpty()) {
        objectList.stream()
                .filter(rdfNode -> null != rdfNode) // Filter out null objects as they will throw an NPE on TreeMap.compare()
                .forEach(rdfNode -> {
                    // The Seq starts with an anon node
                    if (rdfNode.isAnon()) {
                        output.addAll(LDUtilities.getSeqItemsFrom(rdfNode));
                    } else if (rdfNode.isURIResource()) {
                        output.add(rdfNode.asResource().getURI());
                    } else if (rdfNode.isLiteral()) {
                        output.add(rdfNode.asLiteral().getLexicalForm());
                    }
                });
    }
    return output;
}

// Inner method:
protected static List<String> getSeqItemsFrom(final RDFNode node) {
    final List<String> output = new ArrayList<>();
    // Hardcoded as Seq:
    final Seq seq = node.as(Seq.class);
    // Seq members are numbered from 1, matching rdf:_1, rdf:_2, ...
    final int size = seq.size();
    for (int i = 1; i <= size; i++) {
        final RDFNode object = seq.getObject(i);
        if (object.isURIResource()) {
            output.add(object.asResource().getURI());
        } else if (object.isLiteral()) {
            output.add(object.asLiteral().getLexicalForm());
        }
    }
    return output;
}
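
On the question of recognizing a Seq dynamically, here is a minimal sketch. It assumes the data types the container node explicitly as rdf:Seq (for example by writing an <rdf:Seq> element instead of an untyped rdf:Description). ContainerKind, isSeq and isList are hypothetical helper names, not part of the Jena API. The isValid() call is there because, as the first message in this thread shows, a node that is merely typed rdf:List can be viewed as an RDFList and still fail validation.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.RDFList;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Seq;
import org.apache.jena.vocabulary.RDF;

public class ContainerKind {

    // Hypothetical helper: true if the node carries an explicit rdf:type rdf:Seq triple.
    public static boolean isSeq(RDFNode node) {
        return node.isResource() && node.asResource().hasProperty(RDF.type, RDF.Seq);
    }

    // Hypothetical helper: true if the node can be viewed as an RDFList
    // and its rdf:first / rdf:rest cells pass Jena's validity check.
    public static boolean isList(RDFNode node) {
        return node.isResource()
                && node.canAs(RDFList.class)
                && node.as(RDFList.class).isValid();
    }

    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();

        // createSeq() is expected to add the rdf:type rdf:Seq triple itself.
        Seq seq = model.createSeq();
        seq.add(model.createResource("http://example.org/a"));

        RDFList list = model.createList(new RDFNode[] { model.createLiteral("x") });

        System.out.println("seq:  isSeq=" + isSeq(seq) + " isList=" + isList(seq));
        System.out.println("list: isSeq=" + isSeq(list) + " isList=" + isList(list));
    }
}
```

In getLiteralsByPropertyURI the isAnon() test could then be replaced by a call such as isSeq(rdfNode). If the data does not state the rdf:Seq type, this check will not fire, so it is only as reliable as the typing in the input.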


>>> Christopher Johnson <ch...@gmail.com> 03.04.18 20.48 Uhr >>>
[...]

Re: Representing Sequences/Lists in RDF/XML and iterating in Jena

Posted by Christopher Johnson <ch...@gmail.com>.
Hi,

In my experience, a convenient way to create RDF lists is to start with
JSON-LD. Here is an example:
{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  },
  "@graph": [
    {
      "@id": "http://data.rism.info/id/rismauthorities/pe30077074",
      "http://umbel.org/umbel#isLike": {
        "@list": [
          "http://digital.slub-dresden.de/id339704470",
          "http://digital.slub-dresden.de/id313540659",
          "http://digital.slub-dresden.de/id307035654"
        ]
      }
    }
  ]
}

then as n-triples:

<http://data.rism.info/id/rismauthorities/pe30077074> <http://umbel.org/umbel#isLike> _:b0 .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "http://digital.slub-dresden.de/id339704470" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b1 .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "http://digital.slub-dresden.de/id313540659" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b2 .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "http://digital.slub-dresden.de/id307035654" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .

then as rdfxml:

<?xml version='1.0' encoding='utf-8' ?>
<rdf:RDF xmlns:ns0='http://umbel.org/umbel#'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
    <rdf:Description rdf:about='http://data.rism.info/id/rismauthorities/pe30077074'>
        <ns0:isLike>
            <rdf:Description rdf:nodeID='b0'>
                <rdf:first>http://digital.slub-dresden.de/id339704470</rdf:first>
                <rdf:rest>
                    <rdf:Description rdf:nodeID='b1'>
                        <rdf:first>http://digital.slub-dresden.de/id313540659</rdf:first>
                        <rdf:rest>
                            <rdf:Description rdf:nodeID='b2'>
                                <rdf:first>http://digital.slub-dresden.de/id307035654</rdf:first>
                                <rdf:rest rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#nil'/>
                            </rdf:Description>
                        </rdf:rest>
                    </rdf:Description>
                </rdf:rest>
            </rdf:Description>
        </ns0:isLike>
    </rdf:Description>
</rdf:RDF>

As Dave indicates, creating lists in rdfxml is a bit of a pain.

Christopher Johnson
Scientific Associate
Universitätsbibliothek Leipzig


On 3 April 2018 at 17:09, Dave Reynolds <da...@gmail.com> wrote:

> [...]

Re: Representing Sequences/Lists in RDF/XML and iterating in Jena

Posted by Dave Reynolds <da...@gmail.com>.
Hi,

There seems to be a little confusion here.

There are two mechanisms for representing ordered sequences of things in
RDF. There are containers, which come in three flavours, one of which
(rdf:Seq) is intended for ordered sequences:

https://www.w3.org/TR/rdf-mt/#Containers

There are also collections aka Lists:

https://www.w3.org/TR/rdf-mt/#collections

These are quite different both syntactically and semantically so you 
need to be clear on which one you are trying to use. Then the 
appropriate syntax and Jena API will follow.
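
To make the structural difference concrete, here is a minimal, library-free Java sketch. It is not Jena code: the Triple class and the readSeq/readList helpers are hypothetical names introduced only for illustration, and the sample data is made up. It models the two triple shapes: a container hangs every member off one node via rdf:_1, rdf:_2, ..., while a collection is a chain of cells linked by rdf:first and rdf:rest and closed by rdf:nil.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SeqVersusList {
    public static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical minimal triple: subject, predicate and object as plain strings.
    public static final class Triple {
        final String s, p, o;
        public Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    // Container (rdf:Seq): all members hang off one node via rdf:_1, rdf:_2, ...
    public static List<String> readSeq(List<Triple> graph, String seqNode) {
        Map<Integer, String> byIndex = new TreeMap<>();
        for (Triple t : graph) {
            if (t.s.equals(seqNode) && t.p.startsWith(RDF + "_")) {
                int index = Integer.parseInt(t.p.substring(RDF.length() + 1));
                byIndex.put(index, t.o);
            }
        }
        return new ArrayList<>(byIndex.values()); // ordered by membership index
    }

    // Collection (rdf:List): walk rdf:first / rdf:rest cell by cell until rdf:nil.
    // No cycle detection; this is a sketch for well-formed input only.
    public static List<String> readList(List<Triple> graph, String head) {
        List<String> items = new ArrayList<>();
        String cell = head;
        while (cell != null && !cell.equals(RDF + "nil")) {
            String first = find(graph, cell, RDF + "first");
            if (first == null) break; // malformed cell: stop rather than guess
            items.add(first);
            cell = find(graph, cell, RDF + "rest");
        }
        return items;
    }

    static String find(List<Triple> graph, String s, String p) {
        for (Triple t : graph) {
            if (t.s.equals(s) && t.p.equals(p)) return t.o;
        }
        return null;
    }

    // Made-up sample graph holding the same two items in both shapes.
    public static List<Triple> sample() {
        return Arrays.asList(
            new Triple("_:seq", RDF + "_2", "b"),
            new Triple("_:seq", RDF + "_1", "a"),
            new Triple("_:c1", RDF + "first", "a"),
            new Triple("_:c1", RDF + "rest", "_:c2"),
            new Triple("_:c2", RDF + "first", "b"),
            new Triple("_:c2", RDF + "rest", RDF + "nil"));
    }

    public static void main(String[] args) {
        System.out.println(readSeq(sample(), "_:seq")); // [a, b]
        System.out.println(readList(sample(), "_:c1")); // [a, b]
    }
}
```

A node that has only rdf:_n members gives readList nothing to walk, which is the situation described in the original question.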

On 03/04/18 14:04, Andreas Kahl wrote:
>   Hello everyone,
> 
> we have the requirement to preserve the order of umbel:isLike resources
> and rdagr2:alternateIdentity literals in
> https://data.rism.info/id/rismauthorities/pe30077074?format=rdf . To my
> understanding, rdf:List would be a good choice. I followed the
> recommendation at
> http://patterns.dataincubator.org/book/ordered-list.html (data shortened
> for brevity here):
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>      <rdf:Description rdf:nodeID="genid5">
>          <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#List"/>
>          <rdf:li rdf:resource="http://digital.slub-dresden.de/id339704470"/>
>          <rdf:li rdf:resource="http://digital.slub-dresden.de/id313540659"/>
>          <rdf:li rdf:resource="http://digital.slub-dresden.de/id307035654"/>
>      </rdf:Description>
>      <rdf:Description rdf:about="http://data.rism.info/id/rismauthorities/pe30077074">
>          <umbel:isLike xmlns:umbel="http://umbel.org/umbel#" rdf:nodeID="genid5"/>
>      </rdf:Description>
> </rdf:RDF>

This is not a list. You are using the syntax for Containers but trying 
to give it a type of rdf:List. This will not generate a legal list.

> Selecting this list via SimpleSelector and using Jena's RDFList fails
> because no rdf:first & rdf:rest are found:

Correct, your data doesn't contain any lists.

> The only way of getting RDFList to work is adding
> rdf:parseType='Collection'

Correct, if you are using RDF/XML that's the way to generate lists. 
However, you need to get the whole of the syntax right and you haven't 
shown your modified example.

> but then more Blank Nodes are created and
> only one element is found in the iteration (and not necessarily the
> first). 

Sounds like you haven't got the rest of the syntax right. For an example 
see:
https://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-parsetype-Collection

Note that this doesn't use the rdf:li syntax; that's for Containers, not
for Collections.
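
As a rough illustration of the Collection syntax applied to the data in this thread, here is a minimal sketch. The class name CollectionExample, the method readIsLike and the base URI are assumptions made for the example; the RDF/XML string is a hand-written adaptation of the posted data, not the poster's actual modified file. The list members are written as node elements (rdf:Description with rdf:about) inside the property element, not as rdf:li.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.RDFList;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;

public class CollectionExample {
    static final String SUBJECT = "http://data.rism.info/id/rismauthorities/pe30077074";

    // Hand-written RDF/XML using rdf:parseType='Collection'.
    static final String RDF_XML =
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'\n"
      + "         xmlns:umbel='http://umbel.org/umbel#'>\n"
      + "  <rdf:Description rdf:about='" + SUBJECT + "'>\n"
      + "    <umbel:isLike rdf:parseType='Collection'>\n"
      + "      <rdf:Description rdf:about='http://digital.slub-dresden.de/id339704470'/>\n"
      + "      <rdf:Description rdf:about='http://digital.slub-dresden.de/id313540659'/>\n"
      + "      <rdf:Description rdf:about='http://digital.slub-dresden.de/id307035654'/>\n"
      + "    </umbel:isLike>\n"
      + "  </rdf:Description>\n"
      + "</rdf:RDF>";

    // Parses the RDF/XML above and returns the list members in document order.
    public static List<String> readIsLike() {
        Model model = ModelFactory.createDefaultModel();
        // The base URI is an arbitrary placeholder; all URIs in the data are absolute.
        model.read(new StringReader(RDF_XML), "http://example.org/base", "RDF/XML");

        Resource subject = model.getResource(SUBJECT);
        Property isLike = model.createProperty("http://umbel.org/umbel#isLike");

        // The object of umbel:isLike is the head cell of the rdf:first/rdf:rest chain.
        RDFList list = subject.getProperty(isLike).getObject().as(RDFList.class);

        List<String> uris = new ArrayList<>();
        for (RDFNode item : list.asJavaList()) {
            uris.add(item.asResource().getURI());
        }
        return uris;
    }

    public static void main(String[] args) {
        readIsLike().forEach(System.out::println);
    }
}
```

The parser creates one blank node per list cell, so seeing more blank nodes than with the rdf:li version is expected for a well-formed list.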

> W3C's validator says, the above rdf:List is valid RDF/XML.

It may be valid RDF/XML and it may mention rdf:List but it is not valid 
RDF/XML for a list :)

> Should I use another Java interface in Jena for this? Or do I have to
> implement the list with more low level Jena calls avoiding the RDFList
> interface? (e.g. iterating the _1, _2, _3, ...-Predicates with Selectors
> myself)

The _1, _2 ... predicates are for Containers not Collections.

I'd suggest picking which approach you want to use (ideally Lists but 
your choice), generating test cases in Turtle (*much* easier syntax when 
dealing with lists) and then using the corresponding Jena API (RDFList if 
you use lists). If you need to use RDF/XML then let the tools generate 
that for you from Turtle or from programmatic calls.
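
A minimal sketch of that workflow, assuming the list is first written in Turtle and Jena is then asked to serialize it as RDF/XML. The class name TurtleToRdfXml, the method convert and the base URI are hypothetical choices for the example; the Turtle document is a hand-written version of the data from this thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class TurtleToRdfXml {
    // In Turtle, the ( ... ) syntax produces the rdf:first / rdf:rest chain.
    static final String TURTLE =
        "@prefix umbel: <http://umbel.org/umbel#> .\n"
      + "<http://data.rism.info/id/rismauthorities/pe30077074> umbel:isLike (\n"
      + "    <http://digital.slub-dresden.de/id339704470>\n"
      + "    <http://digital.slub-dresden.de/id313540659>\n"
      + "    <http://digital.slub-dresden.de/id307035654>\n"
      + ") .\n";

    // Reads the Turtle test case and lets Jena write the equivalent RDF/XML.
    public static String convert() {
        Model model = ModelFactory.createDefaultModel();
        // The base URI is an arbitrary placeholder; all IRIs in the data are absolute.
        model.read(new StringReader(TURTLE), "http://example.org/base", "TURTLE");

        StringWriter out = new StringWriter();
        model.write(out, "RDF/XML-ABBREV"); // the abbreviated writer gives more compact output
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert());
    }
}
```

The generated RDF/XML can then serve as a reference for what an XSLT stylesheet would have to emit.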

Dave
> 
> Thanks & Best Regards
> Andreas
> 
> P.S. My Jena dependency and version currently used:
> <dependency>
>       <groupId>org.apache.jena</groupId>
>       <artifactId>apache-jena</artifactId>
>       <version>3.6.0</version>
>       <type>pom</type>
> </dependency>
>