Posted to users@jena.apache.org by Michael Brunnbauer <br...@netestate.de> on 2012/10/17 17:51:36 UTC

Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Hello Andy,

[Fuseki does not accept RDF generated with Jena]
On Wed, Sep 19, 2012 at 04:16:33PM +0100, Andy Seaborne wrote:
> >What would be the best way to check for such things in the same way as 
> >Fuseki ?
> See:
> org.openjena.riot.system.IRIResolver.

We are now checking every IRI in the graphs with IRIResolver (btw the 
documentation does not state that IRIResolver.checkIRI will return False
instead of True for a correct URI).

Here comes the next problem: Fuseki says "not a legal instance of Data type
XSD:hexBinary" if I try to load the attached file generated with Jena and
java-rdfa-0.4.2 from http://csarven.ca/7

17:13:46 INFO  Fuseki               :: [148664] POST http://ts.foaf-search.net:3030/crawl/update
17:13:46 WARN  SPARQL_Update$HttpActionUpdate :: Transaction still active in endWriter - no commit or abort seen (forced abort)
17:13:46 WARN  Fuseki               :: [148664] RC = 500 : Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] during parse -org.apache.xerces.impl.dv.InvalidDatatypeValueException: cvc-datatype-valid.1.2.1: 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a valid value for 'hexBinary'.
com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] during parse -org.apache.xerces.impl.dv.InvalidDatatypeValueException: cvc-datatype-valid.1.2.1: 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a valid value for 'hexBinary'.
	at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.getValue(LiteralLabelImpl.java:326)
	at com.hp.hpl.jena.datatypes.xsd.XSDhexBinary.getHashCode(XSDhexBinary.java:81)
	at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.hashCode(LiteralLabelImpl.java:448)
	at com.hp.hpl.jena.graph.Node.hashCode(Node.java:331)
	at java.util.HashMap.getEntry(HashMap.java:344)
	at java.util.HashMap.containsKey(HashMap.java:335)
	at org.openjena.atlas.lib.cache.CacheSetLRU.contains(CacheSetLRU.java:82)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableCache.cacheLookup(NodeTableCache.java:147)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableCache._idForNode(NodeTableCache.java:118)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableCache.getNodeIdForNode(NodeTableCache.java:79)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:49)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
	at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:97)
	at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.getAllocateNodeId(NodeTableTrans.java:85)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getAllocateNodeId(NodeTableWrapper.java:43)
	at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getAllocateNodeId(NodeTableInline.java:51)
	at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:84)
	at com.hp.hpl.jena.tdb.store.QuadTable.add(QuadTable.java:63)
	at com.hp.hpl.jena.tdb.store.QuadTable.add(QuadTable.java:57)
	at com.hp.hpl.jena.tdb.store.GraphNamedTDB._performAdd(GraphNamedTDB.java:83)
	at com.hp.hpl.jena.tdb.store.GraphTDBBase.performAdd(GraphTDBBase.java:77)
	at com.hp.hpl.jena.graph.impl.SimpleBulkUpdateHandler.add(SimpleBulkUpdateHandler.java:63)
	at com.hp.hpl.jena.graph.impl.SimpleBulkUpdateHandler.addIterator(SimpleBulkUpdateHandler.java:75)
	at com.hp.hpl.jena.graph.impl.SimpleBulkUpdateHandler.add(SimpleBulkUpdateHandler.java:87)
	at com.hp.hpl.jena.graph.impl.SimpleBulkUpdateHandler.add(SimpleBulkUpdateHandler.java:81)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:161)
	at com.hp.hpl.jena.sparql.modify.request.UpdateLoad.visit(UpdateLoad.java:59)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:40)
	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:334)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:327)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:307)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:257)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:242)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:232)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:118)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommonWorker(SPARQL_ServletBase.java:117)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:67)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:84)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:598)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:442)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1033)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:369)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:967)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
	at org.eclipse.jetty.server.Server.handle(Server.java:358)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:851)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
	at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:293)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)
	at java.lang.Thread.run(Thread.java:662)
17:13:46 INFO  Fuseki               :: [148664] 500 Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a legal instance of Datatype[http://www.w3.org/2001/XMLSchema#hexBinary] during parse -org.apache.xerces.impl.dv.InvalidDatatypeValueException: cvc-datatype-valid.1.2.1: 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
EF AC 3E 84 0A 7C B1 7E B7 31 D8 D3 A1 83 D0 BC
A8 7C EE 18 E2 E7 89 B2 59 4A 02 45 EF ED D6 CB
7A 55 25 17 F6 EE 87 06 72 74 13 95 7F 75 49 97
80 84 24 A3 69 8E 2E 6A 7B DA 98 C9 05 C6 8C C3' is not a valid value for 'hexBinary'.

What should I do ? Would it be feasible to copy the Fuseki code checking the 
data into our code ?

Regards,

Michael Brunnbauer

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
On 24/10/12 16:54, Damian Steer wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 24/10/12 13:52, Andy Seaborne wrote:
>
>> Not knowing how java-rdfa works, I guess that it is creating
>> langtags directly.  It does not hook into RIOT.  The validation
>> code only works for RIOT parsing (NT, Turtle, etc)
>
> Yep, it creates lang tags directly.
>
> Any pointers on hooking it into RIOT?

It's not done well.

org.openjena.riot.system.ParserProfileChecker.

Checking is done as nodes/triples are created - too hardwired.

This is wrong - it should be a Sink that does the checking (c.f. 
o.o.riot.pipeline which is half started/finished)

But it is right in that it has easy access to the line and column 
information for error messages.
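
A rough sketch of what "a Sink that does the checking" could look like, for illustration only. The `ItemSink` interface and `CheckingSink` class below are hypothetical stand-ins written for this example; they are not the RIOT classes, and the real pipeline code may be shaped quite differently. The idea is simply that validation sits in its own stage between the producer (for example java-rdfa) and the consumer, rather than inside node creation.

```java
import java.util.function.Predicate;

// Hypothetical minimal sink interface; NOT the actual RIOT/Atlas API.
interface ItemSink<T> {
    void send(T item);
}

// Wraps two downstream sinks: items that pass the check go to 'accepted',
// everything else goes to 'rejected' (e.g. a logger or an error list).
final class CheckingSink<T> implements ItemSink<T> {
    private final Predicate<T> check;
    private final ItemSink<T> accepted;
    private final ItemSink<T> rejected;

    CheckingSink(Predicate<T> check, ItemSink<T> accepted, ItemSink<T> rejected) {
        this.check = check;
        this.accepted = accepted;
        this.rejected = rejected;
    }

    @Override
    public void send(T item) {
        if (check.test(item)) {
            accepted.send(item);
        } else {
            rejected.send(item);
        }
    }
}
```

The trade-off mentioned above still applies: a stage like this sees only the item, so it has no line and column information unless the producer passes that along with each item.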

	Andy

>
> Damian
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
>
> iEYEARECAAYFAlCID0IACgkQAyLCB+mTtymD/ACfdf528mL/tKO1ONW5F3Y5AITj
> j30An22U2WQHKshH6ozp8xpgQ3iVq/53
> =Frxo
> -----END PGP SIGNATURE-----
>


Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Damian Steer <d....@bristol.ac.uk>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/10/12 13:52, Andy Seaborne wrote:

> Not knowing how java-rdfa works, I guess that it is creating
> langtags directly.  It does not hook into RIOT.  The validation
> code only works for RIOT parsing (NT, Turtle, etc)

Yep, it creates lang tags directly.

Any pointers on hooking it into RIOT?

Damian

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCID0IACgkQAyLCB+mTtymD/ACfdf528mL/tKO1ONW5F3Y5AITj
j30An22U2WQHKshH6ozp8xpgQ3iVq/53
=Frxo
-----END PGP SIGNATURE-----

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
On 24/10/12 09:53, Michael Brunnbauer wrote:
>
> Hallo Andy,
>
> On Tue, Oct 23, 2012 at 09:32:58PM +0200, Michael Brunnbauer wrote:
>> I am talking about adding SysRIOT.wireIntoJena() to get the language tag
>> exception when parsing instead of when submitting to Fuseki via SPARQL update.
>
> This does not work. The strict parser is probably not wired into java-rdfa
> by this. Should I try to reparse the .nt file I created to get the exception ?

Not knowing how java-rdfa works, I guess that it is creating langtags 
directly.  It does not hook into RIOT.  The validation code only works 
for RIOT parsing (NT, Turtle, etc)

	Andy


>
> Regards,
>
> Michael Brunnbauer
>


Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Michael Brunnbauer <br...@netestate.de>.
Hallo Andy,

On Tue, Oct 23, 2012 at 09:32:58PM +0200, Michael Brunnbauer wrote:
> I am talking about adding SysRIOT.wireIntoJena() to get the language tag
> exception when parsing instead of when submitting to Fuseki via SPARQL update.

This does not work. The strict parser is probably not wired into java-rdfa
by this. Should I try to reparse the .nt file I created to get the exception ?

Regards,

Michael Brunnbauer

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
On 23/10/12 20:32, Michael Brunnbauer wrote:
>
> Hello Andy,
>
> On Tue, Oct 23, 2012 at 08:14:00PM +0100, Andy Seaborne wrote:
>>> POST http://ts.foaf-search.net:3030/crawl/update
>> Is that a SPARQL Update, what exactly is it?  Is it a LOAD  -- or INSERT
>> DATA?
>
> It's a LOAD.

I've made Fuseki handle the exception more cleanly.  No stack trace.

>
>>> I just saw that some of my tools use SysRIOT.wireIntoJena() but the Crawler
>>> does not. Would that help here if I catch the Exceptions ?
>>
>> Yes.
>
> OK. I will add SysRIOT.wireIntoJena() to get this Exception when parsing.
>
> Will my recently added code to check every IRI in the graph with
> org.openjena.riot.system.IRIResolver be redundant, then ?
>
>> Various things try to wire in RIOT but if you use only core Jena it
>> might be (currently) possible to bypass the initialization.
>
> Hmm. Are we talking about the same thing ? I am talking about adding
> SysRIOT.wireIntoJena() to get the language tag exception when parsing instead
> of when submitting to Fuseki via SPARQL update.
>
> Regards,
>
> Michael Brunnbauer
>


Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Michael Brunnbauer <br...@netestate.de>.
Hello Andy,

On Tue, Oct 23, 2012 at 08:14:00PM +0100, Andy Seaborne wrote:
> > POST http://ts.foaf-search.net:3030/crawl/update
> Is that a SPARQL Update, what exactly is it?  Is it a LOAD  -- or INSERT 
> DATA?

It's a LOAD.

> >I just saw that some of my tools use SysRIOT.wireIntoJena() but the Crawler
> >does not. Would that help here if I catch the Exceptions ?
> 
> Yes.

OK. I will add SysRIOT.wireIntoJena() to get this Exception when parsing.

Will my recently added code to check every IRI in the graph with
org.openjena.riot.system.IRIResolver be redundant, then ?

> Various things try to wire in RIOT but if you use only core Jena it 
> might be (currently) possible to bypass the initialization. 

Hmm. Are we talking about the same thing ? I am talking about adding
SysRIOT.wireIntoJena() to get the language tag exception when parsing instead
of when submitting to Fuseki via SPARQL update.

Regards,

Michael Brunnbauer

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
On 23/10/12 19:43, Michael Brunnbauer wrote:
>
> hi all
>
> here is the next problem with java-rdfa-0.4.2, jena-2.7.3 and
> jena-fuseki-0.2.5-20121019.052315-69. This time, it is the language tag @2-ru
> in http://www.fizkult-ura.com/fitness/35. I tried to load the attached .nt
> file generated with java-rdfa-0.4.2 and jena-2.7.3 from that url to Fuseki.

 > POST http://ts.foaf-search.net:3030/crawl/update

Is that a SPARQL Update, what exactly is it?  Is it a LOAD  -- or INSERT 
DATA?

The system recovered (no data loaded).



A minimal example would be:

-------------
<http://example/s> <http://example/p>  ""@2-ru .
-------------
and command:

riot --validate D.nt


Language tags can't have digits at that location.

rfc5646 / bcp47 has the grammar from hell in it but the WG has decided 
not to adopt it (it's as big as turtle itself!)

Turtle uses

'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*

The first part is ascii-alpha only.
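
For code that builds literals directly and bypasses RIOT, the Turtle production above can be approximated with a regular expression. This is a minimal sketch, not the check RIOT itself performs; the class and method names are invented for the example, and the tag is assumed to be passed without the leading '@'.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena/RIOT. It mirrors the Turtle
// production  '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*  with the '@' already removed.
final class LangTagCheck {
    private static final Pattern TURTLE_LANGTAG =
            Pattern.compile("[a-zA-Z]+(-[a-zA-Z0-9]+)*");

    // True if the whole string matches the Turtle language tag shape.
    static boolean isTurtleLangTag(String tag) {
        return tag != null && TURTLE_LANGTAG.matcher(tag).matches();
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"ru", "en-GB", "de-1996", "2-ru"}) {
            System.out.println(tag + " -> " + isTurtleLangTag(tag));
        }
    }
}
```

With this pattern "2-ru" is rejected because the first subtag must be ASCII letters only, while digits are fine in later subtags such as "de-1996".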

> I just saw that some of my tools use SysRIOT.wireIntoJena() but the Crawler
> does not. Would that help here if I catch the Exceptions ?

Yes.

Various things try to wire in RIOT but if you use only core Jena it 
might be (currently) possible to bypass the initialization.  We've been 
discussing making jena-core pull in RIOT always via reflection, so we 
can do it now in advance of any code organisation.


>
> On a side note: I got problems with a tel: URI ending with space today from
> Henry Story's homepage (already corrected by Henry):
>
>   <a rel="foaf:phone" href="tel:+15106981206 ">

Yes - spaces are illegal in URIs for all URI schemes always.
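
A quick way to see the effect outside Jena is the JDK's own `java.net.URI` parser, which also rejects a raw space. This is only an illustrative sketch: `java.net.URI` is not the IRI checker used by RIOT and the two can disagree on other inputs. The class and method names are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper using only the JDK; not the Jena IRI checker.
final class UriSpaceCheck {
    // True if java.net.URI accepts the string as written.
    static boolean parses(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String href = "tel:+15106981206 ";        // trailing space, as in the page
        System.out.println(parses(href));         // rejected
        System.out.println(parses(href.trim()));  // accepted once the space is gone
    }
}
```

Whether a parser should silently trim such values is a separate question; the sketch only shows that the untrimmed form is not a legal URI.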

>
> java-rdfa-0.4.2 produced that URI with a trailing space in the graph and the
> irichecker then refused it. Seems to be correct behaviour but Henry suggested
> that the parser maybe should strip the space. Damian ?
>
> Here is the Fuseki log for the first problem:
>
> 19:59:06 INFO  Fuseki               :: [190665] POST http://ts.foaf-search.net:3030/crawl/update
> 19:59:06 ERROR riot                 :: [line: 5, col: 204] Bad language tag
> 19:59:06 WARN  SPARQL_Update$HttpActionUpdate :: Transaction still active in endWriter - no commit or abort seen (forced abort)
> 19:59:06 WARN  Fuseki               :: [190665] RC = 500 : org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag
> com.hp.hpl.jena.shared.JenaException: org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag
> 	at org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:150)

	Andy



Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Michael Brunnbauer <br...@netestate.de>.
hi all

here is the next problem with java-rdfa-0.4.2, jena-2.7.3 and
jena-fuseki-0.2.5-20121019.052315-69. This time, it is the language tag @2-ru
in http://www.fizkult-ura.com/fitness/35. I tried to load the attached .nt 
file generated with java-rdfa-0.4.2 and jena-2.7.3 from that url to Fuseki.

I just saw that some of my tools use SysRIOT.wireIntoJena() but the Crawler
does not. Would that help here if I catch the Exceptions ?

On a side note: I got problems with a tel: URI ending with space today from
Henry Story's homepage (already corrected by Henry):

 <a rel="foaf:phone" href="tel:+15106981206 ">

java-rdfa-0.4.2 produced that URI with a trailing space in the graph and the 
irichecker then refused it. Seems to be correct behaviour but Henry suggested
that the parser maybe should strip the space. Damian ?

Here is the Fuseki log for the first problem:

19:59:06 INFO  Fuseki               :: [190665] POST http://ts.foaf-search.net:3030/crawl/update
19:59:06 ERROR riot                 :: [line: 5, col: 204] Bad language tag
19:59:06 WARN  SPARQL_Update$HttpActionUpdate :: Transaction still active in endWriter - no commit or abort seen (forced abort)
19:59:06 WARN  Fuseki               :: [190665] RC = 500 : org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag
com.hp.hpl.jena.shared.JenaException: org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag
	at org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:150)
	at org.openjena.riot.system.JenaReaderRIOT.read(JenaReaderRIOT.java:95)
	at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:268)
	at com.hp.hpl.jena.util.FileManager.readModelWorker(FileManager.java:403)
	at com.hp.hpl.jena.util.FileManager.loadModelWorker(FileManager.java:306)
	at com.hp.hpl.jena.util.FileManager.loadModel(FileManager.java:258)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:159)
	at com.hp.hpl.jena.sparql.modify.request.UpdateLoad.visit(UpdateLoad.java:59)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:40)
	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:334)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:327)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:307)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:257)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:240)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:230)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:118)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommonWorker(SPARQL_ServletBase.java:117)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:67)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:84)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:598)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:442)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1033)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:369)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:967)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
	at org.eclipse.jetty.server.Server.handle(Server.java:358)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:851)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
	at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:293)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag
	at org.openjena.riot.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:130)
	at org.openjena.riot.lang.LangEngine.raiseException(LangEngine.java:169)
	at org.openjena.riot.lang.LangEngine.nextToken(LangEngine.java:116)
	at org.openjena.riot.lang.LangNTriples.parseOne(LangNTriples.java:57)
	at org.openjena.riot.lang.LangNTriples.parseOne(LangNTriples.java:33)
	at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:69)
	at org.openjena.riot.lang.LangBase.parse(LangBase.java:43)
	at org.openjena.riot.system.JenaReaderNTriples2.readWorker(JenaReaderNTriples2.java:39)
	at org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:138)
	... 42 more
19:59:06 INFO  Fuseki               :: [190665] 500 org.openjena.riot.RiotException: [line: 5, col: 204] Bad language tag

Regards,

Michael Brunnbauer

On Thu, Oct 18, 2012 at 04:37:39PM +0200, Michael Brunnbauer wrote:
> 
> Hello Andy,
> 
> On Thu, Oct 18, 2012 at 03:31:16PM +0100, Andy Seaborne wrote:
> > > Recorded as JENA-335.
> > Fixed - this is a fix to jena-core in the datatype code (details in the 
> > JIRA).
> > In tonight's snapshot builds.
> 
> Thanks! I will try the latest fuseki snapshot tomorrow.
> 
> Regards,
> 
> Michael Brunnbauer
> 
> -- 
> ++  Michael Brunnbauer
> ++  netEstate GmbH
> ++  Geisenhausener Straße 11a
> ++  81379 München
> ++  Tel +49 89 32 19 77 80
> ++  Fax +49 89 32 19 77 89 
> ++  E-Mail brunni@netestate.de
> ++  http://www.netestate.de/
> ++
> ++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++  USt-IdNr. DE221033342
> ++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Michael Brunnbauer <br...@netestate.de>.
Hello Andy,

On Thu, Oct 18, 2012 at 03:31:16PM +0100, Andy Seaborne wrote:
> > Recorded as JENA-335.
> Fixed - this is a fix to jena-core in the datatype code (details in the 
> JIRA).
> In tonight's snapshot builds.

Thanks! I will try the latest fuseki snapshot tomorrow.

Regards,

Michael Brunnbauer

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
 >> 17:13:46 WARN  Fuseki               :: [148664] RC = 500 : Lexical
 >> form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
 >> 56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
 >> A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
 >> 3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57
 >
 > ....
 >> 05 C6 8C C3' is not a valid value for 'hexBinary'.
 >>     at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.getValue(LiteralLabelImpl.java:326)
 >>     at com.hp.hpl.jena.datatypes.xsd.XSDhexBinary.getHashCode(XSDhexBinary.java:81)
 >>     at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.hashCode(LiteralLabelImpl.java:448)
 >
 > That is a problem - at that point it should be tolerant.
 >
 > Something is odd about hexBinary because illegal lexical forms for other
 > things (e.g. xsd:float) do not present this problem.
 >
 > Recorded as JENA-335.

Fixed - this is a fix to jena-core in the datatype code (details in the 
JIRA).

In tonight's snapshot builds.

	Andy

Re: Feeding triples from java-rdfa-0.4.2 + Jena into fuseki-0.2.5-20120916.055428-41

Posted by Andy Seaborne <an...@apache.org>.
On 17/10/12 16:51, Michael Brunnbauer wrote:
>
> Hello Andy,
>
> [Fuseki does not accept RDF generated with Jena]

java-rdfa-0.4.2 isn't Jena.

> On Wed, Sep 19, 2012 at 04:16:33PM +0100, Andy Seaborne wrote:
>>> What would be the best way to check for such things in the same way as
>>> Fuseki ?
>> See:
>> org.openjena.riot.system.IRIResolver.
>
> We are now checking every IRI in the graphs with IRIResolver (btw the
> documentation does not state that IRIResolver.checkIRI will return False
> instead of True for a correct URI).
>
> Here comes the next problem: Fuseki says "not a legal instance of Data type
> XSD:hexBinary" if I try to load the attached file generated with Jena and
> java-rdfa-0.4.2 from http://csarven.ca/7

which indeed has xsd:hexBinary like:

"AB 12"^^xsd:hexBinary

XSD hex binary does not allow spaces in the lexical form (we may wish it 
did ... but it doesn't and it catches people out).

http://www.w3.org/TR/xmlschema-2/#hexBinary

But the check only causes the warning - it's the .hashCode that is the 
problem.

"riot --validate" gives a warning.
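
For a crawler that wants to catch this before sending data to Fuseki, a pre-check on the lexical form could look roughly like the sketch below. It assumes the XSD rule that hexBinary is a sequence of hex digit pairs and nothing else, and it is deliberately strict: it does no whitespace normalisation first. The class name and the `stripWhitespace` repair are illustrative and not taken from Jena.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena. It tests the lexical form as given.
final class HexBinaryCheck {
    // Zero or more pairs of hex digits; the empty string is a legal hexBinary.
    private static final Pattern HEX_PAIRS = Pattern.compile("([0-9a-fA-F]{2})*");

    static boolean isHexBinary(String lexical) {
        return lexical != null && HEX_PAIRS.matcher(lexical).matches();
    }

    // One possible repair for forms like "AB 12": drop all whitespace.
    // The result still has to be re-checked with isHexBinary.
    static String stripWhitespace(String lexical) {
        return lexical.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String fromPage = "AB 12";
        System.out.println(isHexBinary(fromPage));
        System.out.println(isHexBinary(stripWhitespace(fromPage)));
    }
}
```

Repairing the value on the consumer side changes the published data, so fixing the source page remains the cleaner option.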

>
> 17:13:46 INFO  Fuseki               :: [148664] POST http://ts.foaf-search.net:3030/crawl/update
> 17:13:46 WARN  SPARQL_Update$HttpActionUpdate :: Transaction still active in endWriter - no commit or abort seen (forced abort)
> 17:13:46 WARN  Fuseki               :: [148664] RC = 500 : Lexical form 'C7 2A 28 FD 69 34 C0 62 FF 05 C9 7E 85 75 04 70
> 56 B4 CD AD 3A 6C AE 37 6F 50 6A D4 F4 8F 91 C9
> A9 E2 C2 7E 93 3C 39 01 E7 CC A3 3F CF 4A AA 3B
> 3A 54 F3 AF 53 E2 5D D8 D4 9B 53 C3 3A 19 25 57

....
> 05 C6 8C C3' is not a valid value for 'hexBinary'.
> 	at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.getValue(LiteralLabelImpl.java:326)
> 	at com.hp.hpl.jena.datatypes.xsd.XSDhexBinary.getHashCode(XSDhexBinary.java:81)
> 	at com.hp.hpl.jena.graph.impl.LiteralLabelImpl.hashCode(LiteralLabelImpl.java:448)

That is a problem - at that point it should be tolerant.

Something is odd about hexBinary because illegal lexical forms for other 
things (e.g. xsd:float) do not present this problem.

Recorded as JENA-335.


>
> What should I do ? Would it be feasible to copy the Fuseki code checking the
> data into our code ?

Get http://csarven.ca/7 corrected.

And you are welcome to copy the checking code - it's open source.  The 
key word there is "open".  And "source".

	Andy

> Regards,
>
> Michael Brunnbauer
>