Posted to dev@vxquery.apache.org by Erandi Ganepola <er...@gmail.com> on 2017/08/03 18:21:15 UTC

VXQuery RESTful API: Measuring query execution time

Hi all,

As per the REST API specification [1], the REST API operates in two steps.

1. The user sends a request containing the query to be executed. The REST
API compiles the query and submits it to Hyracks for execution. Once the
job is submitted, the REST API returns a response containing the
compilation time, the abstract syntax tree, the runtime plan, etc., along
with a result ID that the user will later use to fetch the results.

2. The user sends a request to the REST API with the *result ID* received
previously. The REST API internally reads the results corresponding to
that result ID from Hyracks and returns them to the user.

Since there is no link between the time at which the user submits the query
and the time at which the user reads the results, we cannot measure the
*query execution time*. Previously, VXQuery made this measurement by
waiting for Hyracks to complete the query execution and then fetching the
results.

AsterixDB uses synchronous requests (submit the query and wait) in its
REST API. (Yes, it also has an asynchronous version, but that does not
measure the execution time correctly since it uses an executor service
[2]; see the ASYNC execution path there.) As far as I understand, what I
have described above is correct.

Am I missing something? If not, what are your thoughts on this scenario?
How can we tackle this?

[1] https://cwiki.apache.org/confluence/display/VXQUERY/SwaggerIO+Configuration
[2] https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-app/src/main/java/org/apache/asterix/api/http/server/QueryServiceServlet.java

Thanks and Regards,
*Erandi Ganepola*
Undergraduate
BSc. Management and Information Technology (IT Sp.)
University of Kelaniya, Sri Lanka

<https://www.linkedin.com/in/erandiganepola/>
<https://github.com/erandiganepola>   <https://twitter.com/erandiganepola>

Re: VXQuery RESTful API: Measuring query execution time

Posted by Michael Carey <mj...@ics.uci.edu>.
AsterixDB actually has three options:

  - Execute query and return results, all synchronously.

  - Execute query and return a HANDLE to the results, synchronously - 
the handle can be used later to get the results.

  - Execute the query asynchronously and return a HANDLE - the handle 
can be used to get the query status and results.

The first two can both measure the end-to-end time: the first one
includes both query execution and returning the results in the time,
while the second one includes query execution but not returning the
results.  (Note that the results are there, i.e., the query is done, in
both timings; it's just a question of whether or not the potentially
expensive gathering and serialization of the final results is
included.)
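
The difference between the first two timings can be sketched in Python as
follows. The function names, the fake handle, and the sleep standing in
for query execution are all illustrative assumptions, not AsterixDB's API.

```python
import json
import time


def run_query() -> list[int]:
    time.sleep(0.02)  # stand-in for query execution
    return list(range(1000))


def serialize(rows: list[int]) -> str:
    return json.dumps(rows)  # stand-in for gathering and serializing results


def timed_with_results() -> tuple[str, float]:
    """Option 1: the timing covers execution and returning the results."""
    start = time.perf_counter()
    payload = serialize(run_query())
    return payload, time.perf_counter() - start


def timed_with_handle() -> tuple[dict, float]:
    """Option 2: the timing stops once the results are ready for pickup."""
    start = time.perf_counter()
    rows = run_query()
    elapsed = time.perf_counter() - start
    # Hypothetical handle; serialization happens only when it is used.
    handle = {"handle": "h-1", "fetch": lambda: serialize(rows)}
    return handle, elapsed
```

In both functions the query has finished by the time the clock stops; only
the first one also pays for `serialize` inside the measured interval.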

Cheers,

Mike

PS - The middle one is actually the most useful one for benchmarking, as 
various other systems in HDFS-land stop after producing the results and 
letting the app know they're ready for pickup.


On 8/12/17 11:37 PM, Erandi Ganepola wrote:
> Hi Preston,
>
> Yes, I have proceeded in that way. There is now a query parameter to
> specify whether the results should be returned synchronously or
> asynchronously. Async mode is similar to what we had previously. Sync
> mode returns the results after waiting for the Hyracks job to complete.


Re: VXQuery RESTful API: Measuring query execution time

Posted by Erandi Ganepola <er...@gmail.com>.
Hi Preston,

Yes, I have proceeded in that way. There is now a query parameter to
specify whether the results should be returned synchronously or
asynchronously. Async mode is similar to what we had previously. Sync
mode returns the results after waiting for the Hyracks job to complete.
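
As a rough illustration of such a switch, the Python sketch below reads a
mode from the request URL. The parameter name `mode`, its values, the
default of `async`, and the example paths are assumptions made for this
sketch; the actual parameter in the VXQuery REST API may differ.

```python
from urllib.parse import parse_qs, urlparse


def parse_mode(url: str) -> str:
    """Return "sync" or "async" from the hypothetical `mode` query parameter."""
    params = parse_qs(urlparse(url).query)
    # Assumed default: async, i.e. the behaviour that existed before.
    mode = params.get("mode", ["async"])[0].lower()
    if mode not in ("sync", "async"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return mode


def handle_query(url: str, run_job) -> dict:
    """Dispatch on the mode: wait for the job, or hand back a result ID."""
    if parse_mode(url) == "sync":
        return {"results": run_job()}  # blocks until the job completes
    return {"result_id": 1}  # placeholder ID; results are fetched later
```

For example, `parse_mode("/query?statement=1&mode=sync")` yields `"sync"`,
and a URL without the parameter falls back to `"async"`.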

Thanks and Regards,

*Erandi Ganepola*
Undergraduate
BSc. Management and Information Technology (IT Sp.)
University of Kelaniya, Sri Lanka

<https://www.linkedin.com/in/erandiganepola/>
<https://github.com/erandiganepola>   <https://twitter.com/erandiganepola>

On Tue, Aug 8, 2017 at 3:20 AM, Preston Carman <pr...@apache.org> wrote:

> I think a few offline discussions have clarified the issue with timing.
> As we move forward, I think timing is important. As you have
> highlighted, different types of timing values are available with
> synchronous and asynchronous requests.
>
> asynchronous
>  query request - returns compile time
>  result request - returns query execution time
>
> synchronous
>  query request - returns compile time and query execution time
>
> It seems you are going ahead with adding a synchronous option for the
> REST service. As for the asynchronous result request, AsterixDB does
> have a solution for this request. I believe Ian has shown you where to
> look.

Re: VXQuery RESTful API: Measuring query execution time

Posted by Preston Carman <pr...@apache.org>.
I think a few offline discussions have clarified the issue with timing.
As we move forward, I think timing is important. As you have
highlighted, different types of timing values are available with
synchronous and asynchronous requests.

asynchronous
 query request - returns compile time
 result request - returns query execution time

synchronous
 query request - returns compile time and query execution time

It seems you are going ahead with adding a synchronous option for the
REST service. As for the asynchronous result request, AsterixDB does
have a solution for this request. I believe Ian has shown you where to
look.
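
One way to make these timing values available in both modes is to measure
the execution time around the job itself, on the worker that runs it, and
keep it with the results. The Python sketch below is only an illustration
under that assumption; all names are hypothetical and it is not taken from
VXQuery or AsterixDB.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)
_jobs: dict[str, Future] = {}


def _timed(job):
    """Run the job and measure how long it runs on the worker thread."""
    start = time.perf_counter()
    results = job()
    return results, time.perf_counter() - start


def query_request(job, compile_time: float, sync: bool = False) -> dict:
    """Query request: always reports compile time; sync also waits for the job."""
    future = _executor.submit(_timed, job)
    if sync:
        results, execution_time = future.result()
        return {
            "compile_time": compile_time,
            "execution_time": execution_time,
            "results": results,
        }
    result_id = str(uuid.uuid4())
    _jobs[result_id] = future
    return {"compile_time": compile_time, "result_id": result_id}


def result_request(result_id: str) -> dict:
    """Result request (async mode): reports the execution time with the results."""
    results, execution_time = _jobs.pop(result_id).result()
    return {"execution_time": execution_time, "results": results}
```

Because the clock starts and stops inside `_timed`, the reported execution
time does not depend on how long the user waits before sending the result
request.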
