Posted to users@jena.apache.org by Stefano Cossu <sc...@artic.edu> on 2017/12/24 22:10:04 UTC
Python bindings?
Hello,
I am writing an LDP server using Python's RDFLib and Fuseki/TDB as a
back-end store.
Right now my application is very slow, I suspect because of HTTP
overhead: profiling shows a large chunk of time spent waiting on sockets.
Is there a reliable way to write Python code against the Fuseki Java
API? I understand that Fuseki is written in Java and there are no native
Python bindings. I have looked at options such as Jython, JPype and
PyJnius, but I am wondering how reliable these options are. Any suggestions?
Thanks,
Stefano
Re: Python bindings?
Posted by Andy Seaborne <an...@apache.org>.
On 28/12/17 04:43, Stefano Cossu wrote:
> Hi Andy,
>
> By doing a straight POST on Fuseki/TDB/Jetty, on a quad-core i7 laptop
> with 12Gb RAM:
>
> time curl -i --data-binary 'WITH <info:graph/__root__> DELETE {}
> INSERT{<info:s#1> <info:p#1> <info:o#1> . } WHERE {}'
> -H'Content-Type:application/sparql-update'
> 'http://localhost:3030/lakesuperior-dev/update'
> HTTP/1.1 204 No Content
> Date: Thu, 28 Dec 2017 04:34:56 GMT
> Fuseki-Request-ID: 12
>
>
> real 0m0.144s
> user 0m0.010s
> sys 0m0.002s
Repeated calls over a reused connection will be faster: when curl is
invoked like that, a new connection is created each time, which is the
maximum HTTP overhead.
That operation is the same as doing:
INSERT DATA {}
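To illustrate the point about curl opening a new connection on every invocation, here is a minimal Python sketch that keeps one HTTP connection open and reuses it for successive updates. It uses only the standard library. The `UpdateClient` class and its method names are hypothetical, and whether connection reuse removes most of the overhead in a given application is an assumption to verify by measurement.

```python
import http.client
from urllib.parse import urlsplit


class UpdateClient:
    """Hypothetical helper: sends SPARQL Updates over one persistent connection."""

    def __init__(self, url):
        parts = urlsplit(url)
        self.path = parts.path
        # A single connection object, reused for every request (HTTP/1.1 keep-alive).
        # The socket is only opened when the first request is sent.
        self.conn = http.client.HTTPConnection(parts.hostname, parts.port or 80)

    def update(self, sparql):
        """POST one SPARQL Update string and return the HTTP status code."""
        self.conn.request(
            "POST",
            self.path,
            body=sparql.encode("utf-8"),
            headers={"Content-Type": "application/sparql-update"},
        )
        response = self.conn.getresponse()
        response.read()  # drain the body so the connection can be reused
        return response.status

    def close(self):
        self.conn.close()
```

A client created with the endpoint from the curl example, `UpdateClient("http://localhost:3030/lakesuperior-dev/update")`, would then send each `update(...)` call over the same socket, provided the server keeps the connection alive.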
>
>
> Fuseki log:
>
> [2017-12-27 22:34:56] Fuseki INFO [12] POST
> http://localhost:3030/lakesuperior-dev/update
> [2017-12-27 22:34:56] Fuseki INFO [12] POST /lakesuperior-dev ::
> 'update' :: [application/sparql-update] ?
> [2017-12-27 22:34:56] Fuseki INFO [12] 204 No Content (125 ms)
>
>
> There is some HTTP overhead indeed but as you suggest it seems to be
> mostly Fuseki doing work. The time for sending the same request goes
> from 120 to 189ms. Would you consider this normal and should I settle on
> it?
When sending from Python?
Sorry - I have no experience using Python, with or without RDFLib, where
this type of performance matters.
>
> This is important for me because so far I have bundled more complex
> requests in one SPARQL update or query request to avoid the HTTP tax,
> but if that were less severe than having Fuseki parse one complex query
> I could rethink my application code.
A sequence of SPARQL Update operations can be sent in one request by
separating them with ";".
>
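The suggestion above, sending several SPARQL Update operations in one request separated by ";", could be sketched in Python roughly as follows. This uses only the standard library. The function names are hypothetical, and the default endpoint is the one from the curl example in this thread, so adjust it for your own dataset.

```python
from urllib import request

# Endpoint taken from the curl example in this thread; an assumption for any other setup.
UPDATE_URL = "http://localhost:3030/lakesuperior-dev/update"


def join_updates(operations):
    """Join SPARQL Update operations into one request body, separated by ';'."""
    # Strip whitespace and any trailing ';' so separators are not doubled.
    cleaned = [op.strip().rstrip(";").strip() for op in operations if op.strip()]
    return " ;\n".join(cleaned)


def build_update_request(operations, url=UPDATE_URL):
    """Build (but do not send) a single POST carrying all operations."""
    body = join_updates(operations).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def send_updates(operations, url=UPDATE_URL):
    """Send all operations in one HTTP round trip and return the status code."""
    with request.urlopen(build_update_request(operations, url)) as response:
        return response.status
```

Calling `send_updates([...])` with a list of update strings costs one HTTP round trip instead of one per operation. Whether one larger request is cheaper overall than several small ones still depends on how long Fuseki takes to parse and execute it.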
> Thanks,
> Stefano
Re: Python bindings?
Posted by Stefano Cossu <sc...@artic.edu>.
Hi Andy,
Doing a straight POST on Fuseki/TDB/Jetty, on a quad-core i7 laptop
with 12 GB of RAM:
time curl -i --data-binary 'WITH <info:graph/__root__> DELETE {}
INSERT{<info:s#1> <info:p#1> <info:o#1> . } WHERE {}' \
-H'Content-Type:application/sparql-update' \
'http://localhost:3030/lakesuperior-dev/update'
HTTP/1.1 204 No Content
Date: Thu, 28 Dec 2017 04:34:56 GMT
Fuseki-Request-ID: 12
real 0m0.144s
user 0m0.010s
sys 0m0.002s
Fuseki log:
[2017-12-27 22:34:56] Fuseki INFO [12] POST http://localhost:3030/lakesuperior-dev/update
[2017-12-27 22:34:56] Fuseki INFO [12] POST /lakesuperior-dev :: 'update' :: [application/sparql-update] ?
[2017-12-27 22:34:56] Fuseki INFO [12] 204 No Content (125 ms)
There is indeed some HTTP overhead, but as you suggest, it seems to be
mostly Fuseki doing work. The time for sending the same request goes
from 120 to 189 ms. Would you consider this normal, and should I settle
for it?
This is important for me because so far I have bundled more complex
requests in one SPARQL update or query request to avoid the HTTP tax,
but if that were less severe than having Fuseki parse one complex query
I could rethink my application code.
Thanks,
Stefano
--
Stefano Cossu
Director of Application Services, Collections
The Art Institute of Chicago
116 S. Michigan Ave.
Chicago, IL 60603
312-499-4026
Re: Python bindings?
Posted by Andy Seaborne <an...@apache.org>.
> I suspect due to the HTTP overhead: profiling shows a large chunk of time waiting for sockets.
If it is waiting, then either Fuseki is doing work (see the log file,
which has entries at the start and end of each operation), or the
client is waiting (maybe connection management issues?).
Fuseki does keep the connection open (connection caching). If the log
looks correct, how long is the client waiting?
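One way to answer the question of how long the client waits is to time the calls on the client side and compare them with the durations in the Fuseki log. In the curl run reported elsewhere in this thread, the wall time was 144 ms against 125 ms in the Fuseki log, so roughly 19 ms was spent outside Fuseki's own processing, which includes process start-up and connection set-up. The helper names below are hypothetical; this is only a sketch of the measurement, not a benchmark harness.

```python
import statistics
import time


def time_calls(func, repeats=20):
    """Call func() repeatedly and return per-call wall-clock times in milliseconds."""
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms):
    """Return min, median and max of a list of timings in milliseconds."""
    return {
        "min": min(timings_ms),
        "median": statistics.median(timings_ms),
        "max": max(timings_ms),
    }
```

Passing a function that performs one update request to `time_calls` gives client-side figures to set against the "(… ms)" values in the server log. A large gap between the two would point at the client or the connection handling rather than at Fuseki.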
Andy
Re: Python bindings?
Posted by Stefano Cossu <sc...@artic.edu>.
Dick,
I am interested in hearing the reasons behind your developers dropping
RDFLib, which I find very convenient for de/serializing RDF but I feel
like it is somewhat brittle and quite obscure in the back end connection
part. I think that your approach to using straight HTTP calls for that
may be a better choice.
Also, thanks for the tip on Thrift. I am not familiar with it but I
would be interested in knowing how your team is building Python bindings
for the Jena API if it is meant to become a public project at some point.
Best,
Stefano
Re: Python bindings?
Posted by dandh988 <da...@gmail.com>.
We use Python against Jena/Fuseki/CustomHTTP and find direct SPARQL against the endpoint to be "fast". The Python devs dropped RDFLib.
We also have a Thrift connection in development, which is proving useful for low-level Jena API access.
Dick