Posted to users@jena.apache.org by Andy Seaborne <an...@apache.org> on 2013/09/22 18:28:35 UTC

Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML files [.owl]

On 22/09/13 15:41, Marco Neumann wrote:
> had to rename the static *.owl files to *.rdf and now the bulk load
> works just fine with tdbloader2 (apache-jena-2.11.0)
>
>
>
> On Sun, Sep 22, 2013 at 10:08 AM, Marco Neumann <ma...@gmail.com> wrote:
>> get the following tdbloader2 exception now with all my RDF/XML files
>> on apache-jena-2.11.0
>>
>> INFO  Load: lotico.owl -- 2013/09/22 13:59:02 UTC
>> ERROR [line: 3, col: 1 ] Broken IRI (newline): rdf:RDF
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> org.apache.jena.riot.RiotException: [line: 3, col: 1 ] Broken IRI
>> (newline): rdf:RDF
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>          at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
>>          at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
>>          at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
>>          at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:63)
>>          at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
>>          at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
>>          at org.apache.jena.riot.RiotReader.parse(RiotReader.java:116)
>>          at org.apache.jena.riot.RiotReader.parse(RiotReader.java:93)
>>          at org.apache.jena.riot.RiotReader.parse(RiotReader.java:66)
>>          at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:163)
>>          at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>>          at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>>          at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>>          at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:81)
>>
>> --
>>
>>
>> ---
>> Marco Neumann
>> KONA
>
>
>

I'm happy to go with whatever is the community decision here - should 
".owl" default to assuming RDF/XML?

There is no formal definition of ".owl". It is mentioned in OWL1 but not 
in OWL2.

This came about from

http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E

but, on reviewing that thread, it could be that the initial analysis was
wrong, as the report was expanded upon over several rounds of email.

Adding .owl in as a default extension to mean RDF/XML does not break any 
of the tests.

	Andy


Re: Registering .owl as RDF/XML

Posted by Andy Seaborne <an...@apache.org>.
See JENA-548
https://issues.apache.org/jira/browse/JENA-548

Registering .owl as RDF/XML

Posted by Andy Seaborne <an...@apache.org>.
I wrote a test program (at end) to investigate the difference between 
.owl being registered as RDF/XML and not being registered.

The only difference is in the case where the system otherwise has no 
idea of the language: with .owl registered, it tries RDF/XML.

When the app provides a suggested language, this is used unless the far 
end gives a non-text/plain content type. "text/plain" is too unreliable 
to assume anything as it's what you get if you do nothing.  RIOT does 
not assume ISO-8859-1 in this case either.

In both cases, the server-declared type is used.  This covers the case 
where the publisher changes the content type of a URL without breaking the 
application that used to work.

	Andy

== .owl not registered

URL=http://example/file.owl
Content-type=null
Lang=null
   ==> null

URL=http://example/file.owl
Content-type=null
Lang=Lang:Turtle
   ==> Lang:Turtle

URL=http://example/file.owl
Content-type=text/plain
Lang=null
   ==> null

URL=http://example/file.owl
Content-type=text/plain
Lang=Lang:Turtle
   ==> Lang:Turtle

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=null
   ==> Lang:RDF/JSON

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=Lang:Turtle
   ==> Lang:RDF/JSON



== With .owl registered

URL=http://example/file.owl
Content-type=null
Lang=null
   ==> Lang:RDF/XML

URL=http://example/file.owl
Content-type=null
Lang=Lang:Turtle
   ==> Lang:Turtle

URL=http://example/file.owl
Content-type=text/plain
Lang=null
   ==> Lang:RDF/XML

URL=http://example/file.owl
Content-type=text/plain
Lang=Lang:Turtle
   ==> Lang:Turtle

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=null
   ==> Lang:RDF/JSON

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=Lang:Turtle
   ==> Lang:RDF/JSON

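import org.apache.jena.riot.Lang ;
import org.apache.jena.riot.RDFDataMgr ;
import org.apache.jena.riot.WebContent ;

// Wrapper class and imports added so the program compiles on its own;
// the class name is arbitrary and the packages assume apache-jena-2.11.0.
public class DetermineLangTest
{
    public static void main(String ... argv) throws Exception
    {
        // No content type from the server.
        test("http://example/file.owl",
             null, null) ;
        test("http://example/file.owl",
             null, Lang.TTL) ;

        // text/plain: too unreliable to act on.
        test("http://example/file.owl",
             WebContent.contentTypeTextPlain, null) ;
        test("http://example/file.owl",
             WebContent.contentTypeTextPlain, Lang.TTL) ;

        // A specific RDF content type declared by the server.
        test("http://example/file.owl",
             WebContent.contentTypeRDFJSON, null) ;
        test("http://example/file.owl",
             WebContent.contentTypeRDFJSON, Lang.TTL) ;
    }

    // Print the inputs and the language RIOT chooses for them.
    public static void test(String url, String ctStr, Lang lang) {
        System.out.printf(
            "URL=%s\nContent-type=%s\nLang=%s\n", url, ctStr, lang) ;
        Lang lang2 = RDFDataMgr.determineLang(url, ctStr, lang) ;
        System.out.println("  ==> "+lang2) ;
        System.out.println() ;
    }
}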
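
For readers without a Jena build to hand, the precedence described in this message (server-declared content type first, unless it is missing or text/plain; then the application's suggested language; then the file extension) can be sketched as a standalone function. This is only an illustration and not the RIOT implementation: the class LangResolutionSketch, its two lookup tables and the plain strings used for language names are all hypothetical, and falling through on an unrecognised content type is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone illustration only; this is NOT how RIOT is implemented.
final class LangResolutionSketch {
    // Hypothetical, deliberately tiny lookup tables.
    private final Map<String, String> byContentType = new HashMap<>();
    private final Map<String, String> byExtension = new HashMap<>();

    LangResolutionSketch() {
        byContentType.put("application/rdf+xml", "RDF/XML");
        byContentType.put("application/rdf+json", "RDF/JSON");
        byContentType.put("text/turtle", "Turtle");
        byExtension.put("rdf", "RDF/XML");
        byExtension.put("ttl", "Turtle");
    }

    // Register an extra file extension, e.g. "owl" -> "RDF/XML".
    void registerExtension(String extension, String lang) {
        byExtension.put(extension, lang);
    }

    // Returns the chosen language name, or null when nothing is known.
    String resolve(String url, String contentType, String hint) {
        // 1. A server-declared type wins, unless absent or text/plain.
        if (contentType != null && !contentType.equals("text/plain")) {
            String byType = byContentType.get(contentType);
            if (byType != null) {
                return byType;
            }
            // Assumption: an unrecognised type falls through to the hint.
        }
        // 2. Otherwise use the language suggested by the application.
        if (hint != null) {
            return hint;
        }
        // 3. Otherwise guess from the file extension, if there is one.
        int dot = url.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        return byExtension.get(url.substring(dot + 1));
    }

    public static void main(String[] args) {
        String url = "http://example/file.owl";
        LangResolutionSketch sketch = new LangResolutionSketch();
        System.out.println("unregistered: " + sketch.resolve(url, "text/plain", null));
        sketch.registerExtension("owl", "RDF/XML");
        System.out.println("registered:   " + sketch.resolve(url, "text/plain", null));
    }
}
```

Under these assumptions, registering "owl" only changes the rows where both the content type and the suggested language give no usable answer, which matches the two tables above.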



Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML files [.owl]

Posted by Andy Seaborne <an...@apache.org>.
>> I'm happy to go with whatever is the community decision here - should ".owl"
>> default to assuming RDF/XML?
>
> neither am I, maybe time to phase out the owl extension. in what
> serialization does jena read the owl file in my example?

N-Quads - the default for tdbloader

(think pipeline
    gzip -d < FILE.nq.gz | tdbloader -- -
)
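
Since an unrecognised extension falls back to the loader default, the workaround reported at the start of this thread was to give the files a .rdf extension. A minimal sketch of that step in plain Java follows; it copies rather than renames so the originals are left untouched. The class and method names are hypothetical and nothing here is part of Jena.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jena.
final class OwlToRdfCopies {
    // Copies every *.owl file in dir to a sibling *.rdf file and returns
    // the paths created. Existing *.rdf files are left alone.
    static List<Path> copyOwlAsRdf(Path dir) throws IOException {
        // Collect the matches first so the directory is not modified
        // while it is being listed.
        List<Path> owlFiles = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.owl")) {
            for (Path owl : stream) {
                owlFiles.add(owl);
            }
        }
        List<Path> created = new ArrayList<>();
        for (Path owl : owlFiles) {
            String name = owl.getFileName().toString();
            String base = name.substring(0, name.length() - ".owl".length());
            Path rdf = owl.resolveSibling(base + ".rdf");
            if (Files.exists(rdf)) {
                continue; // never overwrite an existing file
            }
            Files.copy(owl, rdf);
            created.add(rdf);
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path rdf : copyOwlAsRdf(dir)) {
            System.out.println(rdf);
        }
    }
}
```

The resulting .rdf files can then be passed to the bulk loader in place of the .owl originals.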

>
>> There is no formal definition of ".owl". It is mentioned in OWL1 but not in
>> OWL2.
>
> if it is used I think we should assume a RDF/XML serialization
>
>> This came about from
>>
>> http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E
>>
>
> I share Holger's concerns here but would not go along with the
> assumption to find a ttl serialization in an .owl file.
>
>> but reviewing that it could be the initial analysis was wrong as the report
>> was expanded upon over several rounds of email.
>>
>> Adding .owl in as a default extension to mean RDF/XML does not break any of
>> the tests.
>>
>>          Andy
>>
>
> this might be a good solution until the community has come to some
> consensus on file extensions.
>
> let me say thank you to the entire team for the new release. well done!

On behalf of everyone, thank you.

>
>

	Andy


Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML files [.owl]

Posted by Marco Neumann <ma...@gmail.com>.
On Sun, Sep 22, 2013 at 12:28 PM, Andy Seaborne <an...@apache.org> wrote:
>
> I'm happy to go with whatever is the community decision here - should ".owl"
> default to assuming RDF/XML?

neither am I, maybe time to phase out the owl extension. in what
serialization does jena read the owl file in my example?

> There is no formal definition of ".owl". It is mentioned in OWL1 but not in
> OWL2.

if it is used I think we should assume a RDF/XML serialization

> This came about from
>
> http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E
>

I share Holger's concerns here but would not go along with the
assumption to find a ttl serialization in an .owl file.

> but reviewing that it could be the initial analysis was wrong as the report
> was expanded upon over several rounds of email.
>
> Adding .owl in as a default extension to mean RDF/XML does not break any of
> the tests.
>
>         Andy
>

this might be a good solution until the community has come to some
consensus on file extensions.

let me say thank you to the entire team for the new release. well done!


-- 


---
Marco Neumann
KONA