Posted to user@phoenix.apache.org by Vijay Kukkala <vi...@gmail.com> on 2015/01/16 22:03:19 UTC

Scan performance using JDBC result impacted by limit

I am using plain JDBC code to execute a query against a Phoenix cluster
running 4.0.0 with HBase 0.98.x.

My query is as follows, against a table with a single column family:
select cid,ts,id,pc,un,ug,ui,s,inf,sm,mst,se  from wcs_re where cid = ? and
ts >= ? and ts <= ? limit ?

<cid, ts, id> form the primary key of the table.

On the client side, the retrieved values are iterated over and converted to
domain objects. Since the query was taking a long time, I started measuring
the time taken to do the conversion for each object.

The issue I see is that as I increase the LIMIT clause value in the query from
100 to 1000 to 2000 and so on, the domain conversion time per record retrieved
from the result set increases gradually from under 1 ms to 9 ms to 17 ms.
Ideally, I would have expected the conversion time to be constant.
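
A minimal standalone sketch of the kind of per-record measurement described
above. In the real code the loop would iterate a JDBC ResultSet returned by
the Phoenix query; here the rows are simulated with a plain array so the
sketch runs on its own, and trim() stands in for the actual domain
conversion:

```java
import java.util.Arrays;

public class ConversionTiming {
    public static void main(String[] args) {
        // Simulated rows; in the real code this loop would be
        // while (rs.next()) { ... rs.getString("pc") ... }
        String[] rows = new String[2000];
        Arrays.fill(rows, "  some-varchar-value  ");

        long totalNanos = 0;
        for (String raw : rows) {
            long t0 = System.nanoTime();
            String converted = raw.trim();  // stand-in for building the domain object
            totalNanos += System.nanoTime() - t0;
            if (converted.isEmpty()) throw new AssertionError("unexpected empty value");
        }
        System.out.println("avg ns/record: " + totalNanos / rows.length);
    }
}
```

If the per-record average climbs as the row count grows, the extra time is
being spent in the per-record getter calls rather than in the scan itself.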

Can somebody help shed some light on this?

thanks
Vijay

Re: Scan performance using JDBC result impacted by limit

Posted by Vijay Kukkala <ac...@gmail.com>.
Samarth,

Thank you for your response.

To clarify, I am not calling PhoenixResultSet.toString().

While iterating through the retrieved rows with resultSet.next(), I call
resultSet.getString("pc"), where "pc" is the name of a column of type
VARCHAR.

The call to resultSet.getString("pc") is where retrieving the data takes a
lot of time compared to resultSet.getBytes("pc") (at least 3-5 times more).
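
Phoenix encodes VARCHAR values as UTF-8 bytes (worth verifying on your
version), so one workaround along these lines is to fetch the raw bytes and
decode them directly, bypassing whatever extra per-call work getString()
performs. A minimal sketch; the column name "pc" is the one from the query
above, and whether this is actually faster should be measured against your
own workload:

```java
import java.nio.charset.StandardCharsets;

public class VarcharDecode {
    // Decode the byte[] that getBytes("pc") would return for a VARCHAR
    // column, assuming the value is stored as UTF-8 bytes.
    static String decodeVarchar(byte[] raw) {
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // In the real code: byte[] raw = resultSet.getBytes("pc");
        byte[] raw = "product-code-42".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeVarchar(raw));  // prints product-code-42
    }
}
```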

I hope that makes my issue clear.

thanks,
Vijay


Re: Scan performance using JDBC result impacted by limit

Posted by Samarth Jain <sa...@gmail.com>.
Vijay,

Is there a reason why you are calling PhoenixResultSet.toString()? Is it for
logging purposes?

Regarding your question about the increase in object creation time, that
doesn't seem to be Phoenix related. Are you seeing the increase in time for
resultSet.next() or for resultSet.getObject()?

-Samarth
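
One way to separate the two costs Samarth asks about is to time next() and
the getter independently. A standalone sketch; the FakeResultSet class below
is a hypothetical stand-in for the real java.sql.ResultSet so the example
runs without a cluster, and "pc" is the column name from the original query:

```java
import java.util.Arrays;

public class NextVsGetterTiming {
    // Hypothetical stand-in for the JDBC ResultSet: next() advances the
    // cursor, getString() fetches the current row's value.
    static class FakeResultSet {
        private int i = -1;
        private final String[] rows;
        FakeResultSet(String[] rows) { this.rows = rows; }
        boolean next() { return ++i < rows.length; }
        String getString(String col) { return rows[i]; }
    }

    public static void main(String[] args) {
        String[] rows = new String[1000];
        Arrays.fill(rows, "value");
        FakeResultSet rs = new FakeResultSet(rows);

        long nextNanos = 0, getNanos = 0;
        while (true) {
            long t0 = System.nanoTime();
            boolean more = rs.next();   // time spent advancing/fetching
            nextNanos += System.nanoTime() - t0;
            if (!more) break;
            long t1 = System.nanoTime();
            rs.getString("pc");         // time spent in the getter
            getNanos += System.nanoTime() - t1;
        }
        System.out.println("next ns total: " + nextNanos);
        System.out.println("get  ns total: " + getNanos);
    }
}
```

If the getter bucket dominates and grows with the row count, that points at
per-call conversion cost rather than at the scan.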

Re: Scan performance using JDBC result impacted by limit

Posted by Vijay Kukkala <vi...@gmail.com>.
Just to add more info on the issue we were facing and the workaround we
applied: PhoenixResultSet.getString() takes far more time than
PhoenixResultSet.getBytes().

The time spent in the formatting and other logic inside getString() grows
with the number of items to be processed.

Somebody might want to take a look at this.


