Posted to users@jena.apache.org by Andy Seaborne <an...@apache.org> on 2013/09/22 18:28:35 UTC
Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML
files [.owl]
On 22/09/13 15:41, Marco Neumann wrote:
> had to rename the static *.owl files to *.rdf and now the bulk load
> works just fine with tdbloader2 (apache-jena-2.11.0)
>
>
>
> On Sun, Sep 22, 2013 at 10:08 AM, Marco Neumann <ma...@gmail.com> wrote:
>> get the following tdbloader2 exception now with all my RDF/XML files
>> on apache-jena-2.11.0
>>
>> INFO Load: lotico.owl -- 2013/09/22 13:59:02 UTC
>> ERROR [line: 3, col: 1 ] Broken IRI (newline): rdf:RDF
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> org.apache.jena.riot.RiotException: [line: 3, col: 1 ] Broken IRI
>> (newline): rdf:RDF
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
>> at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
>> at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
>> at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:63)
>> at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
>> at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:116)
>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:93)
>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:66)
>> at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:163)
>> at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>> at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>> at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>> at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:81)
>>
>> --
>>
>>
>> ---
>> Marco Neumann
>> KONA
>
>
>
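For anyone needing the workaround quoted above before loading, a minimal sketch of scripting it follows. It copies rather than renames, so the original .owl files stay in place. The class and method names are made up for illustration, and it uses only the standard java.nio.file API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OwlToRdfCopy {
    /** Copies every *.owl file in dir to a sibling *.rdf file; returns the number copied. */
    static int copyOwlAsRdf(Path dir) throws IOException {
        int copied = 0;
        try (DirectoryStream<Path> owlFiles = Files.newDirectoryStream(dir, "*.owl")) {
            for (Path owl : owlFiles) {
                String name = owl.getFileName().toString();
                String rdfName = name.substring(0, name.length() - ".owl".length()) + ".rdf";
                // Files.copy throws FileAlreadyExistsException if the target exists,
                // so an existing .rdf file is never overwritten.
                Files.copy(owl, owl.resolveSibling(rdfName));
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan: first argument, or the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Copied " + copyOwlAsRdf(dir) + " file(s)");
    }
}
```

The resulting *.rdf files can then be passed to tdbloader2 as in the report above.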
I'm happy to go with whatever is the community decision here - should
".owl" default to assuming RDF/XML?
There is no formal definition of ".owl". It is mentioned in OWL1 but not
in OWL2.
This came about from
http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E
but, on review, the initial analysis there may have been wrong, as the
report was expanded upon over several rounds of email.
Adding .owl in as a default extension to mean RDF/XML does not break any
of the tests.
Andy
Re: Registering .owl as RDF/XML
Posted by Andy Seaborne <an...@apache.org>.
See JENA-548
https://issues.apache.org/jira/browse/JENA-548
Registering .owl as RDF/XML
Posted by Andy Seaborne <an...@apache.org>.
I wrote a test program (at end) to investigate the difference between
.owl being registered as RDF/XML and it not being registered.
The only difference is in the case where the system otherwise has no
idea of the language: with .owl registered, it tries RDF/XML.
When the app provides a suggested language, this is used unless the far
end gives a non-text/plain content type. "text/plain" is too unreliable
to assume anything, as it's what you get if you do nothing. RIOT does
not assume ISO-8859-1 in this case either.
In both cases, a server-declared type (other than text/plain) is used.
This covers the case where the publisher changes the content type of a
URL without breaking the application that used to work.
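As a rough illustration of the precedence described above, the decision can be modelled in plain Java. This is not RIOT's code: the class name, the use of strings for language names, and both lookup tables are made up for the sketch, and falling through to the hint when a content type is unrecognised is an assumption.

```java
import java.util.Map;

public class LangGuessSketch {
    // Illustrative content-type table; not Jena's actual registry.
    static final Map<String, String> BY_CONTENT_TYPE = Map.of(
        "application/rdf+xml", "RDF/XML",
        "text/turtle", "Turtle",
        "application/rdf+json", "RDF/JSON");

    static String determineLang(String url, String contentType, String hint,
                                Map<String, String> byExtension) {
        // 1. A declared content type other than text/plain wins.
        if (contentType != null && !contentType.equals("text/plain")) {
            String lang = BY_CONTENT_TYPE.get(contentType);
            if (lang != null) return lang;
            // Assumption: an unrecognised type falls through to the hint.
        }
        // 2. Otherwise the language suggested by the application is used.
        if (hint != null) return hint;
        // 3. Otherwise fall back to the file extension, if it is registered.
        int dot = url.lastIndexOf('.');
        if (dot < 0) return null;
        return byExtension.get(url.substring(dot + 1));
    }

    public static void main(String[] args) {
        String url = "http://example/file.owl";
        Map<String, String> notRegistered = Map.of("rdf", "RDF/XML", "ttl", "Turtle");
        Map<String, String> registered =
            Map.of("rdf", "RDF/XML", "ttl", "Turtle", "owl", "RDF/XML");
        // Only the "no content type, no hint" case depends on the registration.
        System.out.println(determineLang(url, null, null, notRegistered));
        System.out.println(determineLang(url, null, null, registered));
        System.out.println(determineLang(url, "text/plain", "Turtle", registered));
        System.out.println(determineLang(url, "application/rdf+json", "Turtle", registered));
    }
}
```

Registering .owl only adds an entry to the extension table used in step 3, which is why the cases with a hint or a usable content type are unaffected.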
Andy
== .owl not registered

URL=http://example/file.owl
Content-type=null
Lang=null
==> null

URL=http://example/file.owl
Content-type=null
Lang=Lang:Turtle
==> Lang:Turtle

URL=http://example/file.owl
Content-type=text/plain
Lang=null
==> null

URL=http://example/file.owl
Content-type=text/plain
Lang=Lang:Turtle
==> Lang:Turtle

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=null
==> Lang:RDF/JSON

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=Lang:Turtle
==> Lang:RDF/JSON

== With .owl registered

URL=http://example/file.owl
Content-type=null
Lang=null
==> Lang:RDF/XML

URL=http://example/file.owl
Content-type=null
Lang=Lang:Turtle
==> Lang:Turtle

URL=http://example/file.owl
Content-type=text/plain
Lang=null
==> Lang:RDF/XML

URL=http://example/file.owl
Content-type=text/plain
Lang=Lang:Turtle
==> Lang:Turtle

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=null
==> Lang:RDF/JSON

URL=http://example/file.owl
Content-type=application/rdf+json
Lang=Lang:Turtle
==> Lang:RDF/JSON

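import org.apache.jena.riot.Lang ;
import org.apache.jena.riot.RDFDataMgr ;
import org.apache.jena.riot.WebContent ;

// The class name is arbitrary; the original post showed only the two methods.
public class TestDetermineLang
{
    public static void main(String ... argv) throws Exception
    {
        // Each call: URL, server content type (or null), application hint (or null).
        test("http://example/file.owl", null, null) ;
        test("http://example/file.owl", null, Lang.TTL) ;
        test("http://example/file.owl", WebContent.contentTypeTextPlain, null) ;
        test("http://example/file.owl", WebContent.contentTypeTextPlain, Lang.TTL) ;
        test("http://example/file.owl", WebContent.contentTypeRDFJSON, null) ;
        test("http://example/file.owl", WebContent.contentTypeRDFJSON, Lang.TTL) ;
    }

    // Print the inputs, then the language RIOT determines for them.
    public static void test(String url, String ctStr, Lang lang) {
        System.out.printf(
            "URL=%s\nContent-type=%s\nLang=%s\n", url, ctStr, lang) ;
        Lang lang2 = RDFDataMgr.determineLang(url, ctStr, lang) ;
        System.out.println(" ==> "+lang2) ;
        System.out.println() ;
    }
}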
Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML
files [.owl]
Posted by Andy Seaborne <an...@apache.org>.
>> I'm happy to go with whatever is the community decision here - should ".owl"
>> default to assuming RDF/XML?
>
> neither am I, maybe time to phase out the owl extension. in what
> serialization does jena read the owl file in my example?
N-Quads - the default for tdbloader
(think pipeline
gzip -d < FILE.nq.gz | tdbloader -- -
)
>
>> There is no formal definition of ".owl". It is mentioned in OWL1 but not in
>> OWL2.
>
> if it is used I think we should assume a RDF/XML serialization
>
>> This came about from
>>
>> http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E
>>
>
> I share Holger's concerns here but would not go along with the
> assumption to find a ttl serialization in an .owl file.
>
>> but reviewing that it could be the initial analysis was wrong as the report
>> was expanded upon over several rounds of email.
>>
>> Adding .owl in as a default extension to mean RDF/XML does not break any of
>> the tests.
>>
>> Andy
>>
>
> this might be a good solution until the community has come to some
> consensus on file extensions.
>
> let me say thank you to the entire team for the new release. well done!
On behalf of everyone, thank you.
>
>
Andy
Re: apache-jena-2.11.0 tdbloader2 RIOT Reader Execption with RDF/XML
files [.owl]
Posted by Marco Neumann <ma...@gmail.com>.
On Sun, Sep 22, 2013 at 12:28 PM, Andy Seaborne <an...@apache.org> wrote:
> On 22/09/13 15:41, Marco Neumann wrote:
>>
>> had to rename the static *.owl files to *.rdf and now the bulk load
>> works just fine with tdbloader2 (apache-jena-2.11.0)
>>
>>
>>
>> On Sun, Sep 22, 2013 at 10:08 AM, Marco Neumann <ma...@gmail.com>
>> wrote:
>>>
>>> get the following tdbloader2 exception now with all my RDF/XML files
>>> on apache-jena-2.11.0
>>>
>>> INFO Load: lotico.owl -- 2013/09/22 13:59:02 UTC
>>> ERROR [line: 3, col: 1 ] Broken IRI (newline): rdf:RDF
>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>> org.apache.jena.riot.RiotException: [line: 3, col: 1 ] Broken IRI
>>> (newline): rdf:RDF
>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>> at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
>>> at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
>>> at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
>>> at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:63)
>>> at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
>>> at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
>>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:116)
>>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:93)
>>> at org.apache.jena.riot.RiotReader.parse(RiotReader.java:66)
>>> at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:163)
>>> at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>>> at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>>> at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>>> at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:81)
>>>
>>> --
>>>
>>>
>>> ---
>>> Marco Neumann
>>> KONA
>>
>>
>>
>>
>
> I'm happy to go with whatever is the community decision here - should ".owl"
> default to assuming RDF/XML?
neither am I, maybe time to phase out the owl extension. in what
serialization does jena read the owl file in my example?
> There is no formal definition of ".owl". It is mentioned in OWL1 but not in
> OWL2.
if it is used I think we should assume a RDF/XML serialization
> This came about from
>
> http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%3C52169A5B.4060902%40knublauch.com%3E
>
I share Holger's concerns here but would not go along with the
assumption to find a ttl serialization in an .owl file.
> but reviewing that it could be the initial analysis was wrong as the report
> was expanded upon over several rounds of email.
>
> Adding .owl in as a default extension to mean RDF/XML does not break any of
> the tests.
>
> Andy
>
this might be a good solution until the community has come to some
consensus on file extensions.
let me say thank you to the entire team for the new release. well done!
--
---
Marco Neumann
KONA