You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@jena.apache.org by Dang Nguyen <da...@gmail.com> on 2012/12/17 17:05:29 UTC

SPARQL regular path query runtime

Hi everyone,

I am confused over the results I obtain when trying to measure the
execution time of regular path queries in a model.
Essentially, my codes look as follows:

// start of code
// generate Jena model
...
// generate queryString for regular path query
...

// execute the query on the model i = 10 times

for (int i = 0; i < 10; i++){
                        long startTime = System.nanoTime(); // start timer
        Query q = QueryFactory.create(queryString);
 QueryExecution qexec= QueryExecutionFactory.create( q, model );
ResultSet rs= qexec.execSelect();
 long endTime = System.nanoTime(); // end timer

long duration = endTime - startTime;
 System.out.println(duration);
}// end of code

The results I get (in nanosec) are:
521328304
1606834
2182409
1745501
1310299
1244592
1240752
1256112
1242032
4153618

What I don't understand here is the first result always seems to be way
larger than the rest and also the tenth one always seems to be slightly
different too (from running with different i values).
I am guessing that there are some optimizations being done behind the scene
for executing the same queries on the same model but that is just my
speculation.

I would truly appreciate any help on this confusion of mine.

Thank you very much,
-- 
*Dang Nguyen*

Re: SPARQL regular path query runtime

Posted by Andy Seaborne <an...@apache.org>.

On 17/12/12 16:05, Dang Nguyen wrote:
> Hi everyone,
>
> I am confused over the results I obtain when trying to measure the
> execution time of regular path queries in a model.
> Essentially, my codes look as follows:
>
> // start of code
> // generate Jena model
> ...
> // generate queryString for regular path query
> ...
>
> // execute the query on the model i = 10 times
>
> for (int i = 0; i < 10; i++){
>                          long startTime = System.nanoTime(); // start timer
>          Query q = QueryFactory.create(queryString);
>   QueryExecution qexec= QueryExecutionFactory.create( q, model );
> ResultSet rs= qexec.execSelect();

and you need to get the results:

ResultSetFormatter.consume(rs)

otherwise all you measure is the query compilation costs.

>   long endTime = System.nanoTime(); // end timer
>
> long duration = endTime - startTime;
>   System.out.println(duration);
> }// end of code
>
> The results I get (in nanosec) are:
> 521328304

I don't see a warm up step so this includes Java classloading time.

> 1606834
> 2182409
> 1745501
> 1310299
> 1244592
> 1240752
> 1256112
> 1242032
> 4153618
>
> What I don't understand here is the first result always seems to be way
> larger than the rest and also the tenth one always seems to be slightly
> different too (from running with different i values).

Possibly the 10th encourages the JIT to do something, or some extra GC 
happens.

> I am guessing that there are some optimizations being done behind the scene
> for executing the same queries on the same model but that is just my
> speculation.
>
> I would truly appreciate any help on this confusion of mine.
>
> Thank you very much,
>

Re: SPARQL regular path query runtime

Posted by Chris Dollin <ch...@epimorphics.com>.

On Monday, December 17, 2012 10:05:29 AM Dang Nguyen wrote:

> I am confused over the results I obtain when trying to measure the
> execution time of regular path queries in a model.
> Essentially, my codes look as follows:
> 
> // start of code
> // generate Jena model
> ...
> // generate queryString for regular path query
> ...
> 
> // execute the query on the model i = 10 times
> 
> for (int i = 0; i < 10; i++){
>                         long startTime = System.nanoTime(); // start timer
>         Query q = QueryFactory.create(queryString);
>  QueryExecution qexec= QueryExecutionFactory.create( q, model );
> ResultSet rs= qexec.execSelect();
>  long endTime = System.nanoTime(); // end timer

You haven't pulled the results, so the query might still be executing --
in fact it might not have started yet at all.

> long duration = endTime - startTime;
>  System.out.println(duration);
> }// end of code
> 
> The results I get (in nanosec) are:
> 521328304
> 1606834
> 2182409
> 1745501
> 1310299
> 1244592
> 1240752
> 1256112
> 1242032
> 4153618
> 
> What I don't understand here is the first result always seems to be way
> larger than the rest and also the tenth one always seems to be slightly
> different too (from running with different i values).

I'd expect the long first result is the classloading of all the classes
needed to do the SPARQL query.

The long tenth result might be the first garbage collection.

Chris

"Measurement is hard."

-- 
"I know it was late, but Mountjoy never bothers,                /Archer's Goon/
 so long as it's the full two thousand words."

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)