
Posted to users@jena.apache.org by Elli Schwarz <el...@yahoo.com> on 2013/06/25 22:35:48 UTC

JENA-378 Redux

This past January, I reported a bug to this list which was recorded as JENA-378. I'm now experiencing what appears to be the same problem, where [ ] syntax in an Insert script doesn't work when using UpdateExecutionFactory:

  String updateString = "INSERT {} WHERE { ?x ?p [ ?a  ?b ] }";
  UpdateRequest update = UpdateFactory.create(updateString);

  UpdateProcessor up = UpdateExecutionFactory.createRemote(update,
      "http://localhost:3131/ds/update");
  up.execute();

The error is: 400 Encountered " "?" "? "" 
caused by the client generating incorrect SPARQL with an extra ? (as viewed from the Fuseki log):  INSERT { } WHERE   { ?x ?p ??0 . ??0 ?a ?b   } 

This is with jena-core & jena-arq 2.10.2-SNAPSHOT, and with jena-fuseki 0.2.8-SNAPSHOT (compiled today).
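
One possible way to sidestep the client-side rewrite is to send the update text unchanged over the SPARQL 1.1 Protocol, as an HTTP POST with Content-Type application/sparql-update. The following is a minimal sketch using only the JDK's java.net.http classes (Java 11+), not Jena's own API. The class name RawUpdateRequest and its build method are hypothetical, and the endpoint is the one from the snippet above. It only constructs the request; whether the server then accepts the [ ] form is an assumption to verify.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Sketch only: builds (does not send) a SPARQL 1.1 Protocol update request
// that carries the update text exactly as written, with no client-side rewrite.
final class RawUpdateRequest {
    // Endpoint copied from the report; adjust for your own server.
    static final String ENDPOINT = "http://localhost:3131/ds/update";

    // Hypothetical helper: wraps the raw update string in an HTTP POST.
    static HttpRequest build(String endpoint, String updateString) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/sparql-update; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(updateString, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(ENDPOINT, "INSERT {} WHERE { ?x ?p [ ?a  ?b ] }");
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // java.net.http.HttpClient.newHttpClient()
        //     .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

Sending is left as a comment so the sketch has no side effects when run.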
--
Another problem I'm having which I can't track down is that the following code takes a VERY long time to execute (10 minutes):
DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/update").getModel(modelName);

With earlier versions of Fuseki, it would take seconds, with the same data. The problem seems to be related to my Fuseki server instance itself, which is 0.2.8-SNAPSHOT (r1496513), and not to my client code, since even if I use the older stable jena-core and jena-arq 2.10.0 and jena-fuseki 0.2.6, I also have the problem (but not if I connect it to an earlier Fuseki release). Upon debugging, it appears that for some reason the HTTP request itself is taking a long time to complete. In fact, I'm not even getting anything in the Fuseki log for about a minute after the request is made, but once the request is made I immediately see a spike in CPU usage on the server. This doesn't appear to be a network latency issue, since other access to the server isn't affected; it appears to be just this call. It would seem that Fuseki is spinning its wheels on something.

I realize this may not be enough info for you to determine what is causing the problem, but I don't know how else to track down the issue. Using s-get I can get back the data quickly, which is strange since I thought it would be doing the same thing as the getModel().
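
To compare what the server does for different serializations independently of getModel(), a SPARQL 1.1 Graph Store Protocol GET can be issued by hand with an explicit Accept header. What follows is a minimal sketch with the JDK's HTTP classes (Java 11+). The class GraphStoreGet, the /ds/data service URL and the example.org graph name are assumptions for illustration, not values confirmed for this setup.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Sketch only: builds a Graph Store Protocol GET for one named graph,
// asking for a specific serialization through the Accept header.
final class GraphStoreGet {
    // Hypothetical helper: the graph name goes in the ?graph= query parameter, percent-encoded.
    static HttpRequest build(String serviceUrl, String graphUri, String accept) {
        String query = "?graph=" + URLEncoder.encode(graphUri, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(serviceUrl + query))
                .header("Accept", accept)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder service URL and graph name; replace with your own.
        HttpRequest request = build("http://localhost:3131/ds/data",
                "http://example.org/graph", "text/turtle");
        System.out.println(request.uri());
    }
}
```

Timing the same request with "text/turtle" and with "application/rdf+xml" as the Accept value would show whether the delay depends on the requested format.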

Thank you,
Elli

Re: Problem with Fuseki generating RDF/XML

Posted by Andy Seaborne <an...@apache.org>.

On 28/06/13 14:19, Elli Schwarz wrote:
> Andy,
>
> As always, I really appreciate your prompt response and fixes. I
> continue to be amazed at how quickly Jena responds to bugs and even
> feature requests.

Sometimes it's easier to just make the change than to do the paperwork
and risk forgetting it :-)

This wasn't an intentional change.

> And again, jena-text integration is crucial for my
> project, so I greatly appreciate the integration of this work to replace
> Fuseki/LARQ.
>
> I tried rebuilding this morning, and yes, efficiency is greatly improved
> by not using RDF/XML-ABBREV. (It appears that s-get uses Turtle by
> default now...)

Yes :-)

> BTW, I'm a big fan of JSON-LD as eclipsing RDF/XML. I currently use
> jsonld-java for that, and I believe you mentioned to me on that forum
> that you hope to have that fully integrated into Jena at some point. The
> biggest selling point for me is that I am able to give my data as
> JSON-LD to customers and they are able to adapt to use it very easily as
> regular JSON, without them even knowing that they are actually working
> with RDF (though I feel a bit guilty about the subterfuge ;-).

There is an adapter at

https://github.com/afs/jena-jsonld

with an example of integration (== call JenaJSONLD.init())

which is using

https://github.com/jsonld-java/jsonld-java

to do all the real JSON-LD work.

One issue with JSON-LD is that if you work with it as JSON then it may 
not remain JSON-LD/RDF.  It's OK to read but an update to the JSON does 
not necessarily remain correct JSON-LD.

JSON-LD does not scale.  The processing model assumes you have the whole 
document available. It may be possible to write a direct 
JSON-LD->triples parser which is streaming but some of the other 
algorithms work on documents.  jsonld-java builds the JSON-LD in-memory 
first.

	Andy


Re: Problem with Fuseki generating RDF/XML

Posted by Elli Schwarz <el...@yahoo.com>.

Andy,

As always, I really appreciate your prompt response and fixes. I continue to be amazed at how quickly Jena responds to bugs and even feature requests. And again, jena-text integration is crucial for my project, so I greatly appreciate the integration of this work to replace Fuseki/LARQ.

I tried rebuilding this morning, and yes, efficiency is greatly improved by not using RDF/XML-ABBREV. (It appears that s-get uses Turtle by default now...)

BTW, I'm a big fan of JSON-LD as eclipsing RDF/XML. I currently use jsonld-java for that, and I believe you mentioned to me on that forum that you hope to have that fully integrated into Jena at some point. The biggest selling point for me is that I am able to give my data as JSON-LD to customers and they are able to adapt to use it very easily as regular JSON, without them even knowing that they are actually working with RDF (though I feel a bit guilty about the subterfuge ;-).

-Elli




Re: Problem with Fuseki generating RDF/XML

Posted by Andy Seaborne <an...@apache.org>.

Hi there,

I've switched the SPARQL Graph Store Protocol GET back to using plain RDF/XML.

Details:

The default when using RIOT to write in Lang.RDFXML is to use the pretty 
form.  i.e. when using RDFDataMgr.write(model,Lang.RDFXML).  RIOT I/O is 
not automatically used if available.

Fuseki uses the new-style RDFDataMgr, not model.write, so it was affected
by the change.

Writing with model.write() isn't affected.

Yes - RDF/XML-ABBREV is  expensive.  I'm not completely sure why - the 
Turtle writer is doing a similar, but not identical, analysis of the 
model before writing.  However, the RDF/XML-ABBREV writer has more 
choices and more options to consider.

 >> is anyone really using
 >> RDF/XML anymore as a human-readable format anyway?

Absolutely!

But, today, it's the standard.  Tomorrow, it won't be the only choice 
and I'm guessing that Turtle-only toolkits will emerge.

Next ...

DatasetAccessor:

It does not seem to be setting the Accept header at all, so it gets the
default, which is application/rdf+xml.

I've recorded the need to set the accept header to a list based on 
efficiency as:

https://issues.apache.org/jira/browse/JENA-481

I'm thinking the order should be N-Triples, Turtle, RDF/XML, "whatever you
can give me".
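
A preference order like that can be turned into an Accept value with descending q-values. This is a minimal JDK-only sketch, not the change made for JENA-481. The class name AcceptHeader and the 0.1 step between weights are arbitrary choices for illustration.

```java
import java.util.List;

// Sketch only: turns an ordered list of media types into an Accept header value.
final class AcceptHeader {
    // The first type gets the implicit q=1; each later type gets 0.1 less, never below 0.1.
    static String build(List<String> typesInPreferenceOrder) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < typesInPreferenceOrder.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(typesInPreferenceOrder.get(i));
            if (i > 0) {
                // Integer tenths avoid floating-point formatting surprises.
                int tenths = Math.max(1, 10 - i);
                sb.append(";q=0.").append(tenths);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // N-Triples, Turtle, RDF/XML, then "whatever you can give me".
        System.out.println(build(List.of(
                "application/n-triples", "text/turtle", "application/rdf+xml", "*/*")));
    }
}
```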

For reference, the accept string for reading RDF with 
RDFDataMgr.loadModel(URL) or model.read(URL) is currently:

text/turtle,application/rdf+xml;q=0.9,application/xml;q=0.8,*/*;q=0.5;

Maybe that should include "application/n-triples" - including the 
original MIME type of text/plain is distinctly unhelpful.
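
For readers less familiar with q-values, the Accept string above can be read as a ranked list. The sketch below (JDK only, with a hypothetical class name AcceptParser) orders the media ranges by weight. It is a simplification: it ignores parameters other than q and is not how Fuseki or RIOT actually negotiate content.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: ranks the media ranges of an Accept header by their q-value.
final class AcceptParser {
    private static final class Range {
        final String type;
        final double q;

        Range(String type, double q) {
            this.type = type;
            this.q = q;
        }
    }

    // Returns media ranges from most to least preferred; a missing q means 1.0.
    // Malformed q-values make Double.parseDouble throw NumberFormatException.
    static List<String> preferenceOrder(String accept) {
        List<Range> ranges = new ArrayList<>();
        for (String part : accept.split(",")) {
            String[] pieces = part.trim().split(";");
            String type = pieces[0].trim();
            if (type.isEmpty()) {
                continue;
            }
            double q = 1.0;
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            ranges.add(new Range(type, q));
        }
        // List.sort is stable, so ranges with equal q keep their header order.
        ranges.sort(Comparator.comparingDouble((Range r) -> r.q).reversed());
        List<String> ordered = new ArrayList<>();
        for (Range r : ranges) {
            ordered.add(r.type);
        }
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(preferenceOrder(
                "text/turtle,application/rdf+xml;q=0.9,application/xml;q=0.8,*/*;q=0.5;"));
    }
}
```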

	Andy


On 27/06/13 19:56, Rob Vesse wrote:
> Andy can probably give you a definitive answer here
>
> I know that there were significant improvements to the RDF output
> infrastructure made in 2.10.1 so my guess is that somehow the default
> RDF/XML output got switched as part of this upgrade (not necessarily
> intentionally).
>
> If this is the case, Andy can likely make the fix easily; I, however, don't
> know where to look for this setting.
>
> Rob
>
>
> On 6/27/13 11:38 AM, "Elli Schwarz" <el...@yahoo.com> wrote:
>
>> I think I may have tracked down what is causing my slow performance of
>> GET with the new Fuseki 0.2.8 snapshot. Comparing the output of s-get for
>> the same data from the latest Fuseki 0.2.8 snapshot and from the 0.2.6
>> release, I discovered that the 0.2.8 snapshot is creating the XML in
>> hierarchical form, with nesting of elements (RDF/XML-ABBREV). In Fuseki
>> 0.2.6, it would output the RDF in the regular flattened RDF/XML format.
>> Obviously, creating the flattened form is much more efficient.
>>
>> While I understand that RDF/XML-ABBREV is more human readable, there's a
>> big price to pay in efficiency, at least for my data. In my case, I'm
>> accessing my Fuseki endpoint via datasetAccessor.getModel(), and as far
>> as I know, there's no way for me to tell Fuseki through this API that I
>> want the data to be serialized as N-TRIPLES (since it's just going to be
>> loaded in a Jena model anyway and not read by a human). Is there a way I
>> can control how Fuseki serializes by default? And why was the default
>> serialization format changed to RDF/XML-ABBREV - is anyone really using
>> RDF/XML anymore as a human-readable format anyway? ;-)
>>
>> I really appreciate any advice, workarounds, or fixes for this issue. I
>> can't really switch back to the earlier Fuseki versions anymore, since
>> the new jena-text makes my life so much easier since I no longer have to
>> worry about manually reindexing after SPARQL Update, like I did with
>> Fuseki and LARQ. Thanks for incorporating jena-text!
>>
>> Thank you,
>> Elli
>>
>>
>>
>>> ________________________________
>>> From: Elli Schwarz <el...@yahoo.com>
>>> To: "users@jena.apache.org" <us...@jena.apache.org>
>>> Sent: Wednesday, June 26, 2013 9:48 AM
>>> Subject: Problem with Fuseki generating RDF/XML
>>>
>>>
>>> Rob,
>>>
>>> (This email previously had the subject JENA-378 Redux)
>>>
>>> I think I tracked down the problem with getModel() a bit more. Using
>>> s-get, I can get data back as TTL immediately:
>>> ./s-get http://localhost:3131/ds/data http://192.168.6.37/graph/uri_data
>>>
>>>
>>> If I modify the s-get script to get results as RDF/XML, then it takes
>>> several minutes for Fuseki 0.2.8-SNAPSHOT to respond.
>>>
>>> I start Fuseki 0.2.8 with this command (Fuseki 0.2.6 is started similarly,
>>> but with the config-tdb.ttl assembler):
>>> /usr/bin/java -Dlog4j.configuration=log4j.properties -Xmx3200M -jar
>>> /opt/jena-2.10/jena-fuseki-0.2.8-SNAPSHOT/fuseki-server.jar --update
>>> --config=config-tdb-text.ttl --port=3131
>>>
>>>
>>> If I point the same modified s-get script to the Fuseki 0.2.6 release,
>>> the RDF/XML comes back immediately. My guess is that the
>>> DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/data").getModel(modelName)
>>> command I use gets data back as RDF/XML, and for some
>>> reason Fuseki 0.2.8 takes a long time to generate RDF/XML. Any ideas as
>>> to what changed in the latest version of Fuseki that would cause this
>>> problem? Is there any way I can set Fuseki (or the client
>>> DatasetAccessor) to use TTL serialization?
>>>
>>> (BTW, I created JENA-479 for the other bug I discovered with SPARQL
>>> Insert scripts.)
>>>
>>> Thank you very much for your help,
>>> Elli
>>>
>>>
>>>
>>>> ________________________________
>>>> From: Rob Vesse <rv...@yarcdata.com>
>>>> To: "users@jena.apache.org" <us...@jena.apache.org>; Elli Schwarz
>>>> <el...@yahoo.com>
>>>> Sent: Tuesday, June 25, 2013 4:40 PM
>>>> Subject: Re: JENA-378 Redux
>>>>
>>>>
>>>>> I use the older stable jena-core and jena-arq 2.10.0 and jena-fuseki
>>>>> 0.2.6
>>>>
>>>> The current stable releases are jena-core and jena-arq 2.10.1 and
>>>> jena-fuseki 0.2.7
>>>>
>>>> Do you experience the problem with those versions?
>>>>
>>>> Fuseki config file or arguments used to start would be useful.
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 6/25/13 1:35 PM, "Elli Schwarz" <el...@yahoo.com> wrote:
>>>>
>>>>> This past January, I reported a bug to this list which was recorded as
>>>>> JENA-378. I'm now experiencing what appears to be the same problem,
>>>>> where
>>>>> [ ] syntax in an Insert script doesn't work when using
>>>>> UpdateExecutionFactory:
>>>>>
>>>>>   String updateString = "INSERT {} WHERE { ?x ?p [ ?a  ?b ] }";
>>>>>   UpdateRequest update = UpdateFactory.create(updateString);
>>>>>
>>>>>   UpdateProcessor up = UpdateExecutionFactory.createRemote(update,
>>>>>       "http://localhost:3131/ds/update");
>>>>>   up.execute();
>>>>>
>>>>> The error is: 400 Encountered " "?" "? ""
>>>>> caused by the client generating incorrect SPARQL with an extra ? (as
>>>>> viewed from the Fuseki log):
>>>>>   INSERT { } WHERE   { ?x ?p ??0 . ??0 ?a ?b }
>>>>>
>>>>> This is with jena-core & jena-arq 2.10.2-SNAPSHOT, and with
>>>>> jena-fuseki 0.2.8-SNAPSHOT (compiled today).
>>>>> --
>>>>> Another problem I'm having which I can't track down is that the
>>>>> following
>>>>> code takes a VERY long time to execute (10 minutes):
>>>>> DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/update").getModel(modelName);
>>>>>
>>>>> With earlier versions of Fuseki, it would take seconds, with the same
>>>>> data. The problem seems to be related to my Fuseki server instance
>>>>> itself, which is 0.2.8-SNAPSHOT (r1496513), and not to my client code,
>>>>> since even if I use the older stable jena-core and jena-arq 2.10.0 and
>>>>> jena-fuseki 0.2.6, I also have the problem (but not if I connect it to
>>>>> an
>>>>> earlier Fuseki release). Upon debugging, it appears that for some
>>>>> reason
>>>>> the HTTP request itself is taking a long time to complete. In fact, I'm
>>>>> not even getting anything in the Fuseki log for about a minute after
>>>>> the
>>>>> request is made, but once the request is made I immediately see a spike
>>>>> in CPU usage on the server. This doesn't appear to be a network latency
>>>>> issue since other access to the server isn't affected, it appears to be
>>>>> just this call. It would seem that Fuseki is spinning its wheels on
>>>>> something.
>>>>>
>>>>> I realize this may not be enough info for you to determine what is
>>>>> causing the problem, but I don't know how else to track down the issue.
>>>>> Using s-get I can get back the data quickly, which is strange since I
>>>>> thought it would be doing the same thing as the getModel().
>>>>>
>>>>> Thank you,
>>>>> Elli
>>>>
>>>>
>>>>
>>>
>

Re: Problem with Fuseki generating RDF/XML

Posted by Rob Vesse <rv...@yarcdata.com>.

Andy can probably give you a definitive answer here.

I know that there were significant improvements to the RDF output
infrastructure made in 2.10.1 so my guess is that somehow the default
RDF/XML output got switched as part of this upgrade (not necessarily
intentionally).

If this is the case, Andy can likely make the fix easily; I, however, don't
know where to look for this setting.

Rob



Re: Problem with Fuseki generating RDF/XML

Posted by Elli Schwarz <el...@yahoo.com>.

I think I may have tracked down what is causing the slow GET performance with the new Fuseki 0.2.8 snapshot. Comparing the s-get output for the same data from the latest Fuseki 0.2.8 snapshot and from the 0.2.6 release, I discovered that the 0.2.8 snapshot is creating the XML in hierarchical form, with nesting of elements (RDF/XML-ABBREV). Fuseki 0.2.6 would output the RDF in the regular flattened RDF/XML format. Obviously, creating the flattened form is much more efficient.

While I understand that RDF/XML-ABBREV is more human readable, there's a big price to pay in efficiency, at least for my data. In my case, I'm accessing my Fuseki endpoint via datasetAccessor.getModel(), and as far as I know, there's no way for me to tell Fuseki through this API that I want the data to be serialized as N-TRIPLES (since it's just going to be loaded in a Jena model anyway and not read by a human). Is there a way I can control how Fuseki serializes by default? And why was the default serialization format changed to RDF/XML-ABBREV - is anyone really using RDF/XML anymore as a human-readable format anyway? ;-)

I would really appreciate any advice, workarounds, or fixes for this issue. I can't really switch back to the earlier Fuseki versions anymore, because the new jena-text makes my life so much easier: I no longer have to worry about manually reindexing after SPARQL Update, as I did with Fuseki and LARQ. Thanks for incorporating jena-text!

Thank you,
Elli




Re: Problem with Fuseki generating RDF/XML

Posted by Osma Suominen <os...@aalto.fi>.

Hi Elli!

27.06.2013 21:38, Elli Schwarz wrote:

> While I understand that RDF/XML-ABBREV is more human readable,
> there's a big price to pay in efficiency, at least for my data. In my
> case, I'm accessing my Fuseki endpoint via
> datasetAccessor.getModel(), and as far as I know, there's no way for
> me to tell Fuseki through this API that I want the data to be
> serialized as N-TRIPLES (since it's just going to be loaded in a Jena
> model anyway and not read by a human). Is there a way I can control
> how Fuseki serializes by default? And why was the default
> serialization format changed to RDF/XML-ABBREV - is anyone really
> using RDF/XML anymore as a human-readable format anyway? ;-)

Fuseki respects the Accept header in the SPARQL HTTP Graph Store 
protocol. So if you're able to inject an Accept header into the request, 
you can tell Fuseki to output e.g. N-Triples.
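
As a rough client-side sketch of the Accept-header approach: the following uses plain java.net rather than Jena's DatasetAccessor, the class and method names are hypothetical, and it assumes the standard `?graph=` query parameter of the SPARQL Graph Store HTTP Protocol. The service URL and graph name are the ones quoted elsewhere in this thread.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;

// Hypothetical helper, not part of Jena: prepares a Graph Store Protocol GET
// for a named graph and asks for a specific serialization via Accept.
class GraphStoreGet {

    // The graph name goes in the ?graph= query parameter, percent-encoded.
    static String graphUrl(String serviceUrl, String graphName) throws IOException {
        return serviceUrl + "?graph=" + URLEncoder.encode(graphName, "UTF-8");
    }

    // Prepares the request; nothing is sent until the caller reads the response.
    static HttpURLConnection prepare(String serviceUrl, String graphName, String mimeType)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(graphUrl(serviceUrl, graphName)).toURL().openConnection();
        conn.setRequestMethod("GET");
        // e.g. "text/turtle" to request Turtle instead of RDF/XML
        conn.setRequestProperty("Accept", mimeType);
        return conn;
    }
}
```

The stream from `conn.getInputStream()` could then be handed to Jena's `Model.read(InputStream, String base, String lang)` with `"TURTLE"` as the language, which sidesteps the RDF/XML serialization on the server.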

I have Fuseki behind a Varnish HTTP proxy and I've set it up so that 
e.g. /mydataset/data.ttl will be rerouted by Varnish into 
/mydataset/data with the Accept header set to request Turtle. If you can 
put some HTTP proxy in front of Fuseki, you could do the same.
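
To illustrate the rewrite rule only (this is not Varnish configuration), here is a minimal Java sketch of the mapping such a proxy applies. The class name is hypothetical and the `.rdf` entry is an assumption added for illustration; only the `.ttl` case is described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the proxy rule: strip a known file suffix
// from the request path and pick the matching Accept header value.
class SuffixRewrite {
    static final Map<String, String> MIME_BY_SUFFIX = new LinkedHashMap<>();
    static {
        MIME_BY_SUFFIX.put(".ttl", "text/turtle");
        MIME_BY_SUFFIX.put(".rdf", "application/rdf+xml"); // assumed extra mapping
    }

    // Returns {backendPath, acceptHeader}; acceptHeader is null when no suffix matches.
    static String[] rewrite(String path) {
        for (Map.Entry<String, String> e : MIME_BY_SUFFIX.entrySet()) {
            String suffix = e.getKey();
            if (path.endsWith(suffix)) {
                String backendPath = path.substring(0, path.length() - suffix.length());
                return new String[] { backendPath, e.getValue() };
            }
        }
        return new String[] { path, null };
    }
}
```

With this mapping, a request for /mydataset/data.ttl is forwarded to /mydataset/data with Accept set to text/turtle, while a path without a known suffix passes through unchanged.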

I think I've seen some SPARQL HTTP Graph Store implementation support a 
custom URL parameter that can be used to set the MIME type instead of 
using an Accept header, but I can't recall which one. If Fuseki 
supported something like that, it would be easy to request a particular 
serialization format just by using a special URL.

Hope this helps

-Osma

-- 
Osma Suominen | Osma.Suominen@aalto.fi | +358 40 5255 882
Aalto University, Department of Media Technology, Semantic Computing
Research Group
Room 2541, Otaniementie 17, Espoo, Finland; P.O. Box 15500, FI-00076
Aalto, Finland

Problem with Fuseki generating RDF/XML

Posted by Elli Schwarz <el...@yahoo.com>.

Rob,

(This email previously had the subject JENA-378 Redux) 

I think I tracked down the problem with getModel() a bit more. Using s-get, I can get data back as TTL immediately:
./s-get http://localhost:3131/ds/data http://192.168.6.37/graph/uri_data


If I modify the s-get script to get results as RDF/XML, then it takes several minutes for Fuseki 0.2.8-SNAPSHOT to respond.

I start Fuseki 0.2.8 with this command (Fuseki 0.2.6 is started similarly, but with the config-tdb.ttl assembler):
/usr/bin/java -Dlog4j.configuration=log4j.properties -Xmx3200M -jar /opt/jena-2.10/jena-fuseki-0.2.8-SNAPSHOT/fuseki-server.jar --update --config=config-tdb-text.ttl --port=3131


If I point the same modified s-get script to the Fuseki 0.2.6 release, the RDF/XML comes back immediately. My guess is that the DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/data").getModel(modelName) command I use gets data back as RDF/XML, and for some reason Fuseki 0.2.8 takes a long time to generate RDF/XML. Any ideas as to what changed in the latest version of Fuseki that would cause this problem? Is there any way I can set Fuseki (or the client DatasetAccessor) to use TTL serialization?

(BTW, I created JENA-479 for the other bug I discovered with SPARQL Insert scripts.)

Thank you very much for your help,
Elli




Re: JENA-378 Redux

Posted by Rob Vesse <rv...@yarcdata.com>.

> I use the older stable jena-core and jena-arq 2.10.0 and jena-fuseki
>0.2.6

The current stable releases are jena-core and jena-arq 2.10.1 and
jena-fuseki 0.2.7.

Do you experience the problem with those versions?

Fuseki config file or arguments used to start would be useful.

Rob

