Posted to users@jena.apache.org by "Shaw, Ryan" <ry...@unc.edu> on 2023/06/19 23:35:07 UTC

Customizing Turtle pretty printing

I would like to create a custom Turtle pretty printer that does not try to align things like the current pretty printer does. I just want a standard indentation width (e.g. two spaces).

So, instead of:

```
[ a                            time:ProperInterval ;
  time:hasBeginning            [ a                time:Instant ;
                                 time:inDateTime  [ a              time:DateTimeDescription ;
                                                    time:day       "---12"^^xsd:gDay ;
                                                    time:hour      "23"^^xsd:nonNegativeInteger ;
                                                    time:minute    "20"^^xsd:nonNegativeInteger ;
                                                    time:month     "--04"^^xsd:gMonth ;
                                                    time:second    "30"^^xsd:decimal ;
                                                    time:timeZone  bipm:UTC ;
                                                    time:unitType  time:unitSecond ;
                                                    time:year      "1985"^^xsd:gYear
                                                  ]
                               ] ;
  time:hasDurationDescription  [ a           time:DurationDescription ;
                                 time:hours  "04"^^xsd:nonNegativeInteger
                               ] ;
  time:hasEnd                  :when2
] .
```

I would instead like:

```
[
  a time:ProperInterval ;
  time:hasBeginning :when1 ;
  time:hasDurationDescription [
    a time:DurationDescription ;
    time:hours "04"^^xsd:nonNegativeInteger
  ] ;
  time:hasEnd [
    a time:Instant ;
    time:inDateTime [
      a time:DateTimeDescription ;
      time:day "---12"^^xsd:gDay ;
      time:hour "23"^^xsd:nonNegativeInteger ;
      time:minute "20"^^xsd:nonNegativeInteger ;
      time:month "--04"^^xsd:gMonth ;
      time:second "30"^^xsd:decimal ;
      time:timeZone bipm:UTC ;
      time:unitType time:unitSecond ;
      time:year "1985"^^xsd:gYear
    ]
  ]
] .
```

(Note how much easier the 2nd is to read without a super-wide window!)
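A minimal sketch of the fixed-indentation layout requested above, written in Python with only the standard library. It is not Jena or RDFLib code: `write_node`, `INDENT`, and the list-of-pairs input shape are hypothetical names chosen for illustration, and terms are assumed to be already formatted as Turtle strings.

```python
INDENT = "  "  # fixed indentation unit: two spaces, as in the example above


def write_node(properties, depth=0):
    """Render a blank node as Turtle-like text with fixed-width indentation.

    `properties` is a list of (predicate, object) pairs. An object is either
    an already-formatted term (str) or a nested list of pairs (a blank node).
    """
    pad = INDENT * (depth + 1)
    lines = []
    for i, (predicate, obj) in enumerate(properties):
        # Every predicate-object pair except the last ends with " ;".
        end = " ;" if i < len(properties) - 1 else ""
        if isinstance(obj, list):
            # Nested blank node: recurse one level deeper, no column alignment.
            nested = write_node(obj, depth + 1)
            lines.append(f"{pad}{predicate} {nested}{end}")
        else:
            lines.append(f"{pad}{predicate} {obj}{end}")
    return "[\n" + "\n".join(lines) + "\n" + INDENT * depth + "]"


interval = [
    ("a", "time:ProperInterval"),
    ("time:hasBeginning", ":when1"),
    ("time:hasDurationDescription", [
        ("a", "time:DurationDescription"),
        ("time:hours", '"04"^^xsd:nonNegativeInteger'),
    ]),
]

print(write_node(interval) + " .")
```

The indentation depends only on nesting depth, never on the length of a predicate, which is why the output stays narrow however long the property names are.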

I know that I can add a new writer for a new language, but how do I add a new pretty-printed format for an existing language?

Thanks,
Ryan

Re: Customizing Turtle pretty printing

Posted by Nicholas Car <ni...@kurrawong.net>.
longturtle doesn't handle RDF-Star because RDFLib itself doesn't yet.

A student of mine has actually added RDF-Star support to RDFLib, and longturtle could be used for output, but the PR has not been accepted yet (https://github.com/RDFLib/rdflib/pull/2115).

As a format, longturtle is just Turtle with a few more line breaks and outdenting for nested objects. It doesn't do anything fancy regarding deterministic Blank Node serialisation, but it's good for things like Git diffing, most of the time! I use it for 100s of SKOS vocabs that are all stored in Git for management and then loaded into Fuseki for use.
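The Git-diffing point can be illustrated with a small Python sketch using only the standard library. It compares a column-aligned layout with a fixed one-space layout when a property with a longer name is added. The helper names (`wide_layout`, `long_layout`, `changed_lines`) and the sample data are assumptions made for this illustration; they are not RDFLib or Jena functions, and every line ends with " ;" to keep the sketch short.

```python
import difflib


def wide_layout(pairs):
    """Aligned layout: every object starts in the same column."""
    width = max(len(p) for p, _ in pairs)
    return [f"{p.ljust(width)}  {o} ;" for p, o in pairs]


def long_layout(pairs):
    """Fixed layout: exactly one space between predicate and object."""
    return [f"{p} {o} ;" for p, o in pairs]


def changed_lines(before, after):
    """Count the lines a unified diff marks as added or removed."""
    diff = list(difflib.unified_diff(before, after, lineterm=""))
    # The first two lines are the "---"/"+++" file headers; skip them.
    return sum(1 for line in diff[2:] if line.startswith(("+", "-")))


before = [("skos:prefLabel", '"Cat"@en'), ("skos:notation", '"C1"')]
# Adding a longer predicate widens the alignment column in the wide layout.
after = before + [("skos:historyNote", '"Added 2023"@en')]

print("wide:", changed_lines(wide_layout(before), wide_layout(after)))
print("long:", changed_lines(long_layout(before), long_layout(after)))
```

In the aligned layout the new, longer predicate shifts the object column, so every existing line is rewritten; in the fixed layout only the added line shows up in the diff.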

Nick


------- Original Message -------
On Tuesday, June 20th, 2023 at 23:13, Ryan Shaw <ri...@icloud.com.INVALID> wrote:


> > On Jun 19, 2023, at 9:37 PM, Nicholas Car nick@kurrawong.net wrote:
> > 
> > It's not a direct solution, but please have a look at the "LongTurtle" format used in Python's RDFLib library to see if that style of formatting's good/bad etc.
> 
> 
> 
> Yes, I like the LongTurtle format.
> 
> Up to now I’ve been piping arq output into `rdfpipe -o longturtle`. But rdflib serializers don’t handle RDF-star (maybe this has changed recently?), so I was hoping to get Jena to produce longturtlish formatted RDF that also handles RDF-star quoted triples.
> 
> Andy, thanks for the pointers to the code in TurtleShell.
> 
> Cheers,
> Ryan

Re: Customizing Turtle pretty printing

Posted by Ryan Shaw <ri...@icloud.com.INVALID>.
> On Jun 19, 2023, at 9:37 PM, Nicholas Car <ni...@kurrawong.net> wrote:
> 
> It's not a direct solution, but please have a look at the "LongTurtle" format used in Python's RDFLib library to see if that style of formatting's good/bad etc.


Yes, I like the LongTurtle format. 

Up to now I’ve been piping arq output into `rdfpipe -o longturtle`. But rdflib serializers don’t handle RDF-star (maybe this has changed recently?), so I was hoping to get Jena to produce longturtlish formatted RDF that also handles RDF-star quoted triples.

Andy, thanks for the pointers to the code in TurtleShell.

Cheers,
Ryan

Re: Customizing Turtle pretty printing

Posted by Nicholas Car <ni...@kurrawong.net>.
It's not a direct solution, but please have a look at the "LongTurtle" format used in Python's RDFLib library to see if that style of formatting's good/bad etc. This conversion tool allows outputting in LongTurtle:

http://rdftools.kurrawong.net/convert

I'm a maintainer of RDFLib and would be keen to test out other Turtle formatting there to check equivalences with Jena. Always necessary to have 2+ implementations of things!

Cheers, Nick




Re: Customizing Turtle pretty printing

Posted by Nicholas Car <ni...@kurrawong.net>.
Hi Ryan,

I have only the single static test in the RDFLib test suite for comparison, but it caters for most cases, I think! Yes, I admit it could be more methodical but I've just built on top of RDFLib's regular Turtle serializer so far, so there's little criticality in testing for everything as the worst case outcome is just regular turtle.

Here is the test file: 

https://github.com/RDFLib/rdflib/blob/main/test/test_serializers/test_serializer_longturtle.py

You can see it's taking in JSON-LD data, to ensure no Turtle formatting is preserved, then comparing a serialization to a fixed string. Easy to add more data to cover more cases within the same test.

I'm on leave at the moment but could look into more issues here in a couple of weeks if needed.

Cheers, Nick


------- Original Message -------
On Saturday, July 29th, 2023 at 23:43, Shaw, Ryan <ry...@unc.edu> wrote:


> Nick,
> 
> Do you have a set of test cases that you’ve been working with as you develop the longturtle format?
> 
> As I worked on the Jena implementation I was just testing in an ad hoc way, and I’ve since noticed that I missed some edge cases. It would be nice to be able to confirm that the RDFLib and Jena output are as close as possible.
> 
> Thanks,
> Ryan

Re: Customizing Turtle pretty printing

Posted by "Shaw, Ryan" <ry...@unc.edu>.
Nick,

Do you have a set of test cases that you’ve been working with as you develop the longturtle format? 

As I worked on the Jena implementation I was just testing in an ad hoc way, and I’ve since noticed that I missed some edge cases. It would be nice to be able to confirm that the RDFLib and Jena output are as close as possible.

Thanks,
Ryan

> On Jul 25, 2023, at 5:00 PM, Nicholas Car <ni...@kurrawong.net> wrote:
> 
> Hi all,
> 
> Please note that I've just today improved the longturtle format as it is implemented in RDFLib. The implementation now better aligns Blank Nodes, Collections and fixes a small semi-bug (two trailing blank lines). 
> 
> Jena users might just like to compare Jena's longturtle to the longturtle now implemented in RDFLib master branch.
> 
> I think that longturtle will be the default turtle format in RDFLib 7.x due out soon (couple of months) as it's better for Git and better for similarity with SPARQL.
> 
> Cheers, Nick
> 
> 
> ------- Original Message -------
> On Friday, June 23rd, 2023 at 22:40, Andy Seaborne <an...@apache.org> wrote:
> 
> 
>> 
>> On 23/06/2023 15:02, Shaw, Ryan wrote:
>> 
>>>> On Jun 22, 2023, at 5:32 PM, Andy Seaborne andy@apache.org wrote:
>>>> 
>>>> is it doing lists?
>>> 
>>> Yes, with a newline for every list item as in rdflib’s longturtle.
>>> 
>>>> More generally -
>>>> We have Jena 5.x coming up (at a minimum - require Java17)
>>>> 
>>>> Should this become the default pretty layout? or some variant?
>>> 
>>> I’m agnostic about making it the default. Maybe it should just be an option initially?
>> 
>> 
>> Yes.
>> 
>> I hope we can get this into Jena 4.9.0 accessible as a RDFFormat which
>> is "soon".
>> 
>> Andy
>> 
>>>>> How should I register a new RDFFormat (e.g. TURTLE_LONG)?
>>>>> Yes.
>>> 
>>> 👍


Re: Customizing Turtle pretty printing

Posted by Nicholas Car <ni...@kurrawong.net>.
Hi all,

Please note that I've just today improved the longturtle format as it is implemented in RDFLib. The implementation now better aligns Blank Nodes, Collections and fixes a small semi-bug (two trailing blank lines). 

Jena users might just like to compare Jena's longturtle to the longturtle now implemented in RDFLib master branch.

I think that longturtle will be the default turtle format in RDFLib 7.x due out soon (couple of months) as it's better for Git and better for similarity with SPARQL.

Cheers, Nick



Re: Customizing Turtle pretty printing

Posted by Andy Seaborne <an...@apache.org>.

On 23/06/2023 15:02, Shaw, Ryan wrote:
> 
> 
>> On Jun 22, 2023, at 5:32 PM, Andy Seaborne <an...@apache.org> wrote:
>>
>> is it doing lists?
> 
> Yes, with a newline for every list item as in rdflib’s longturtle.
> 
>> More generally -
>> We have Jena 5.x coming up (at a minimum - require Java17)
>>
>> Should this become the default pretty layout? or some variant?
> 
> I’m agnostic about making it the default. Maybe it should just be an option initially?

Yes.

I hope we can get this into Jena 4.9.0 accessible as a RDFFormat which 
is "soon".

     Andy

> 
>>> How should I register a new RDFFormat (e.g. TURTLE_LONG)?
>> Yes.
> 
> 👍

Re: Customizing Turtle pretty printing

Posted by "Shaw, Ryan" <ry...@unc.edu>.
PR to add a "long" Turtle format variant:

https://github.com/apache/jena/pull/1923

Usage:

riot --set ttl:indentStyle=long


Re: Customizing Turtle pretty printing

Posted by "Shaw, Ryan" <ry...@unc.edu>.

> On Jun 22, 2023, at 5:32 PM, Andy Seaborne <an...@apache.org> wrote:
> 
> is it doing lists?

Yes, with a newline for every list item as in rdflib’s longturtle.
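A minimal Python sketch of the "one list item per line" layout mentioned above. `write_list` is a hypothetical helper, not part of Jena or RDFLib, and the exact whitespace either library emits may differ in detail.

```python
def write_list(items, depth=0, indent="  "):
    """Render an RDF collection with each item on its own line.

    `items` are already-formatted Turtle terms; `depth` is the nesting
    level of the line on which the opening parenthesis appears.
    """
    if not items:
        return "()"  # the empty collection stays on one line
    pad = indent * (depth + 1)
    body = "\n".join(pad + item for item in items)
    # The closing parenthesis returns to the indentation of the opening line.
    return "(\n" + body + "\n" + indent * depth + ")"


print(":s :p " + write_list([":a", ":b", '"c"']) + " .")
```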

> More generally -
> We have Jena 5.x coming up (at a minimum - require Java17)
> 
> Should this become the default pretty layout? or some variant?

I’m agnostic about making it the default. Maybe it should just be an option initially? 

>> How should I register a new RDFFormat (e.g. TURTLE_LONG)?
> Yes.

👍

Re: Customizing Turtle pretty printing

Posted by Andy Seaborne <an...@apache.org>.

On 21/06/2023 18:25, Shaw, Ryan wrote:
> I have managed to modify TurtleShell.ShellGraph to output formatted TTL that is very close to rdflib’s longturtle output.
> 
> The modified functions are:
> 
> writeRemainingNLinkedLists
> writeCluster
> writePredicateObjectList (x2)
> writePredicateObject
> writePredicate
> writeNestedObjectTopLevel
> writeNestedObject
> writeList
> write_S_P_Gap
> 
> I could use some advice on how best to integrate this. 

> Should I simply subclass TurtleShell.ShellGraph?
If you think that it is modifying the same algorithm, then yes. Is it doing lists?

More generally -
We have Jena 5.x coming up (at a minimum - require Java17)

Should this become the default pretty layout? or some variant?
The current pretty form does tend to march off the right of the window.

> How should I register a new RDFFormat (e.g. TURTLE_LONG)?
Yes.

> 
> Ryan

     Andy

Re: Customizing Turtle pretty printing

Posted by "Shaw, Ryan" <ry...@unc.edu>.
I have managed to modify TurtleShell.ShellGraph to output formatted TTL that is very close to rdflib’s longturtle output.

The modified functions are:

writeRemainingNLinkedLists
writeCluster
writePredicateObjectList (x2)
writePredicateObject
writePredicate
writeNestedObjectTopLevel
writeNestedObject
writeList
write_S_P_Gap

I could use some advice on how best to integrate this. Should I simply subclass TurtleShell.ShellGraph? How should I register a new RDFFormat (e.g. TURTLE_LONG)?

Ryan

Re: Customizing Turtle pretty printing

Posted by Andy Seaborne <an...@apache.org>.
I think it would be good to have such a writer.

> I know that I can add a new writer for a new language, but how do I add a new pretty-printed format for an existing language?

At the moment there isn't a simple extension mechanism to alter the layout.

The code is in class TurtleShell.ShellGraph. There is a lot of
preprocessing to identify the items that can be printed as [] and ().
The preprocessing could be abstracted.

The layout is done at

   TurtleShell.writeCluster

mostly done within:

   TurtleShell.writeClusterPredicateObjectList
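To give a feel for the kind of preprocessing mentioned above, here is a Python sketch that picks out blank nodes which could be written inline as [ ... ]. It is not a port of TurtleShell.ShellGraph: the rule used (a blank node that is the object of exactly one triple and the subject of at least one) and the name `nestable_blank_nodes` are assumptions for illustration, and cycles longer than a self-loop are not detected.

```python
from collections import Counter


def nestable_blank_nodes(triples):
    """Return the blank nodes that could be written inline as [ ... ].

    Assumed rule for this sketch: the node is the object of exactly one
    triple and the subject of at least one. Terms are plain strings and
    blank nodes are the ones starting with "_:".
    """
    def is_bnode(term):
        return term.startswith("_:")

    # Count how often each blank node is referenced as an object,
    # ignoring self-loops (a node cannot be nested inside itself).
    as_object = Counter(o for s, _, o in triples if is_bnode(o) and s != o)
    subjects = {s for s, _, _ in triples if is_bnode(s)}
    return {b for b in subjects if as_object[b] == 1}


triples = [
    (":i", "time:hasEnd", "_:b1"),
    ("_:b1", "a", "time:Instant"),
    (":x", ":p", "_:b2"),  # _:b2 is shared by two subjects,
    (":y", ":p", "_:b2"),  # so it needs a label and cannot be nested
    ("_:b2", "a", ":Shared"),
]

print(nestable_blank_nodes(triples))
```

A blank node referenced from two places has to keep its `_:` label, because the [ ... ] syntax can only appear once in the output.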

     Andy


Re: Customizing Turtle pretty printing

Posted by Simon Bin <sb...@informatik.uni-leipzig.de>.
Maybe you can take some inspiration from
https://github.com/buda-base/jena-stable-turtle/ 
