Posted to users@jena.apache.org by George News <ge...@gmx.net> on 2017/02/03 21:41:55 UTC

ResultSetFactory.copyResults performance doubt

Hi all,

In my environment I created a class to encapsulate access to the stored
`Dataset` and `Model`s. I didn't realize that the `QueryExecution`
instance was not collected by the GC, and it seems that this could lead
to a memory leak. Looking into it, I found that
`ResultSetFactory.copyResults()` detaches the `ResultSet` from the
`QueryExecution`.

My concern now is about performance: is it really expensive to copy
thousands of results? Would it be better to create another class acting
as a wrapper for the Jena `QueryExecution` and read the `ResultSet`
only once?

This is the kind of thing I'm referring to.

```java
// query is a Query
// ts is the class encapsulating the management of the triple store
SparqlExecutor exec = new SparqlExecutor(query, ts);
ResultSet res = exec.execute();
// Do whatever with the result set
exec.close();
```

Within the `execute` function I manage the transaction, get the
corresponding named graphs, etc.

Thanks a lot.

Jorge

P.S.: It is difficult to explain things without disclosing the code :(
sorry.


Re: ResultSetFactory.copyResults performance doubt

Posted by "A. Soroka" <aj...@virginia.edu>.
>> JENA-1215 : ResultSetCloseable -- this will be in Jena 3.2.0.
>> This may help if you want to pass the ResultSet around - it keeps the  QueryExecution around to close it.
>> 
>> 	Andy
> 
> That’s great. This is actually my class 😊 I’m just waiting for the voting to finish, hehehe

Feel free to test the release candidate and cast a non-binding vote. We have until the end of tomorrow to receive the third binding vote needed for a release, so you might remind someone else to cast their vote! :grin:

---
A. Soroka
The University of Virginia Library

> On Feb 5, 2017, at 4:09 AM, <ge...@gmx.net> <ge...@gmx.net> wrote:
> 
> 
> 
> Sent from jlanza_teclast
> 
> From: Andy Seaborne
> Sent: Saturday, 4 February 2017 16:56
> To: users@jena.apache.org
> Subject: Re: ResultSetFactory.copyResults performance doubt
> 
> 
> 
> On 03/02/17 21:41, George News wrote:
>> Hi all,
>> 
>> In my environment I created a class to encapsulate access to the stored
>> `Dataset` and `Model`s. I didn't realize that the `QueryExecution`
>> instance was not collected by the GC
> 
> Just because
> 
> It should be GC'ed.  This depends on the storage but all the memory ones 
> and TDB that come with Jena itself don't rely on ".close()". It is as 
> much a case of recognizing that QueryExecution is use-once.
> 
>> and it seems that this
>> could lead to a memory leak. Looking into it, I found that
>> `ResultSetFactory.copyResults()` detaches the `ResultSet` from the
>> `QueryExecution`.
> 
> ResultSetFactory.copyResults() is materializing the results and creating 
> a List<> to keep them in so it isn't a huge copy.
> 
> If the app is going to iterate over all the results, this is more a case 
> of moving where work is done, not increasing or decreasing the work.
> 
>> My concern now is about performance: is it really expensive to copy
>> thousands of results? Would it be better to create another class acting
>> as a wrapper for the Jena `QueryExecution` and read the `ResultSet`
>> only once?
>> 
>> This is the kind of thing I'm referring to.
>> 
>> ```java
>> // query is a Query
>> // ts is the class encapsulating the management of the triple store
>> SparqlExecutor exec = new SparqlExecutor(query, ts);
>> ResultSet res = exec.execute();
>> // Do whatever with the result set
>> exec.close();
>> 
>> ```
>> 
>> Within the `execute` function I manage the transaction, get the
>> corresponding named graphs, etc.
> 
> JENA-1215 : ResultSetCloseable -- this will be in Jena 3.2.0.
> 
> This may help if you want to pass the ResultSet around - it keeps the 
> QueryExecution around to close it.
> 
> 	Andy
> 
> 
> That’s great. This is actually my class 😊 I’m just waiting for the voting to finish, hehehe
>> 
>> Thanks a lot.
>> 
>> Jorge
>> 
>> P.S.: It is difficult to explain things without disclosing the code :(
>> sorry.
>> 
> 


RE: ResultSetFactory.copyResults performance doubt

Posted by ge...@gmx.net.

Sent from jlanza_teclast

From: Andy Seaborne
Sent: Saturday, 4 February 2017 16:56
To: users@jena.apache.org
Subject: Re: ResultSetFactory.copyResults performance doubt



On 03/02/17 21:41, George News wrote:
> Hi all,
>
> In my environment I created a class to encapsulate access to the stored
> `Dataset` and `Model`s. I didn't realize that the `QueryExecution`
> instance was not collected by the GC

Just because

It should be GC'ed.  This depends on the storage but all the memory ones 
and TDB that come with Jena itself don't rely on ".close()". It is as 
much a case of recognizing that QueryExecution is use-once.

> and it seems that this
> could lead to a memory leak. Looking into it, I found that
> `ResultSetFactory.copyResults()` detaches the `ResultSet` from the
> `QueryExecution`.

ResultSetFactory.copyResults() is materializing the results and creating 
a List<> to keep them in so it isn't a huge copy.

If the app is going to iterate over all the results, this is more a case 
of moving where work is done, not increasing or decreasing the work.

> My concern now is about performance: is it really expensive to copy
> thousands of results? Would it be better to create another class acting
> as a wrapper for the Jena `QueryExecution` and read the `ResultSet`
> only once?
>
> This is the kind of thing I'm referring to.
>
> ```java
> // query is a Query
> // ts is the class encapsulating the management of the triple store
> SparqlExecutor exec = new SparqlExecutor(query, ts);
> ResultSet res = exec.execute();
> // Do whatever with the result set
> exec.close();
>
> ```
>
> Within the `execute` function I manage the transaction, get the
> corresponding named graphs, etc.

JENA-1215 : ResultSetCloseable -- this will be in Jena 3.2.0.

This may help if you want to pass the ResultSet around - it keeps the 
QueryExecution around to close it.

	Andy


That’s great. This is actually my class 😊 I’m just waiting for the voting to finish, hehehe
>
> Thanks a lot.
>
> Jorge
>
> P.S.: It is difficult to explain things without disclosing the code :(
> sorry.
>


Re: ResultSetFactory.copyResults performance doubt

Posted by Andy Seaborne <an...@apache.org>.

On 03/02/17 21:41, George News wrote:
> Hi all,
>
> In my environment I created a class to encapsulate access to the stored
> `Dataset` and `Model`s. I didn't realize that the `QueryExecution`
> instance was not collected by the GC

Just because

It should be GC'ed.  This depends on the storage but all the memory ones 
and TDB that come with Jena itself don't rely on ".close()". It is as 
much a case of recognizing that QueryExecution is use-once.

> and it seems that this
> could lead to a memory leak. Looking into it, I found that
> `ResultSetFactory.copyResults()` detaches the `ResultSet` from the
> `QueryExecution`.

ResultSetFactory.copyResults() is materializing the results and creating 
a List<> to keep them in so it isn't a huge copy.

If the app is going to iterate over all the results, this is more a case 
of moving where work is done, not increasing or decreasing the work.
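
To make the point about materializing concrete, here is a minimal Java sketch that uses only the standard library. `LazyExecution`, `rows()` and `copyRows()` are hypothetical stand-ins, not Jena classes or methods; they only mimic a result iterator that is tied to a closeable execution. Draining the iterator into a `List` produces each row exactly once, so a caller that was going to read every row anyway does the same amount of work, just earlier.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a query execution: rows are produced lazily
// and are only readable while the execution is open.
class LazyExecution implements AutoCloseable {
    private final int rowCount;
    private boolean closed = false;
    int rowsProduced = 0; // counts how many rows were actually generated

    LazyExecution(int rowCount) { this.rowCount = rowCount; }

    Iterator<String> rows() {
        return new Iterator<String>() {
            private int next = 0;

            @Override public boolean hasNext() {
                if (closed) throw new IllegalStateException("execution closed");
                return next < rowCount;
            }

            @Override public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                rowsProduced++;
                return "row-" + next++;
            }
        };
    }

    @Override public void close() { closed = true; }
}

class MaterializeDemo {
    // Drain the lazy iterator into a list; the list does not need the execution.
    static List<String> copyRows(Iterator<String> rows) {
        List<String> copy = new ArrayList<>();
        while (rows.hasNext()) copy.add(rows.next());
        return copy;
    }

    public static void main(String[] args) {
        List<String> detached;
        try (LazyExecution exec = new LazyExecution(3)) {
            detached = copyRows(exec.rows());
        }
        // The execution is closed here, but the copy is still usable.
        for (String row : detached) System.out.println(row);
    }
}
```

In this sketch `rowsProduced` ends up equal to the number of rows whether they are read lazily or copied first; what the copy adds is that all rows are held in memory at once.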

> My concern now is about performance: is it really expensive to copy
> thousands of results? Would it be better to create another class acting
> as a wrapper for the Jena `QueryExecution` and read the `ResultSet`
> only once?
>
> This is the kind of thing I'm referring to.
>
> ```java
> // query is a Query
> // ts is the class encapsulating the management of the triple store
> SparqlExecutor exec = new SparqlExecutor(query, ts);
> ResultSet res = exec.execute();
> // Do whatever with the result set
> exec.close();
>
> ```
>
> Within the `execute` function I manage the transaction, get the
> corresponding named graphs, etc.

JENA-1215 : ResultSetCloseable -- this will be in Jena 3.2.0.

This may help if you want to pass the ResultSet around - it keeps the 
QueryExecution around to close it.
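
The idea of results that keep their execution around so it can be closed can be sketched in plain Java as follows. This is only an illustration of the pattern under assumed names: `Execution`, `CloseableResults` and `countRows()` are hypothetical and are not the Jena `ResultSetCloseable` API, whose exact methods should be checked in the Jena Javadoc.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical resource standing in for a query execution.
class Execution implements AutoCloseable {
    boolean closed = false;
    private final List<String> rows;

    Execution(List<String> rows) { this.rows = rows; }

    Iterator<String> run() { return rows.iterator(); }

    @Override public void close() { closed = true; }
}

// Results that carry their execution with them, so whoever consumes
// the results can also release the underlying resource.
class CloseableResults implements Iterator<String>, AutoCloseable {
    private final Execution execution;
    private final Iterator<String> rows;

    CloseableResults(Execution execution) {
        this.execution = execution;
        this.rows = execution.run();
    }

    @Override public boolean hasNext() { return rows.hasNext(); }
    @Override public String next() { return rows.next(); }
    @Override public void close() { execution.close(); }
}

class CloseableResultsDemo {
    // The consumer reads the rows once and closes the execution when done.
    static int countRows(CloseableResults results) {
        int n = 0;
        try (CloseableResults r = results) {
            while (r.hasNext()) { r.next(); n++; }
        }
        return n;
    }

    public static void main(String[] args) {
        Execution exec = new Execution(Arrays.asList("a", "b"));
        System.out.println(countRows(new CloseableResults(exec)));
        System.out.println(exec.closed);
    }
}
```

With this shape the results are read once, with no intermediate copy, and the code that passes them around does not need a separate reference to the execution in order to close it.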

	Andy

>
> Thanks a lot.
>
> Jorge
>
> P.S.: It is difficult to explain things without disclosing the code :(
> sorry.
>