Posted to user@phoenix.apache.org by Vikas Agarwal <vi...@infoobjects.com> on 2014/09/05 14:12:52 UTC

Phoenix response time

Hi,

Preface: We are testing Phoenix using the Hortonworks distribution for HBase
on an Amazon EC2 instance (r3.large <http://aws.amazon.com/ec2/pricing/>, 2
CPU/15 GB RAM).

In contrast to the performance benchmarks
<http://phoenix.apache.org/performance.html>, I found Phoenix to be very
slow, even when querying on the primary key (row key). So I tried giving
HBase and Phoenix more RAM, and then increasing both CPU and RAM by
upgrading the EC2 machine type to r3.xlarge (4 CPU, 30 GB RAM). The results
were as follows.

Time taken to return the result of a query on the row key:

With Storm running and very little RAM available: *50 sec*
With Storm stopped and RAM available to Phoenix and HBase: *18 sec*
With the next larger machine type (4 CPU, 30 GB RAM): *8 sec*
Pure HBase query by row key with Storm stopped (2 CPU, 15 GB RAM): *0.0150 seconds*. :)

So Phoenix is many times slower than what native HBase gives us. I cannot
understand how that is possible. What am I missing here?

-- 
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax

Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
One other factor, Vikas. Make sure you don't include the connection
time in your measurements. For example, if you use psql.py to run the
query, it connects to the cluster, runs the query, and disconnects.
Establishing a connection can take 2-3 seconds. If you use sqlline.py
on the other hand, it keeps the connection open once started. Then you
can get a better idea of query times as you run them one after
another.

A second important factor is what is already in the HBase block cache.
Make sure you either run the query several times (if you want to take
this into account), or clear the block cache after each run (if you
don't).

Thanks,
James
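
One way to keep connection time out of the measurement, as James suggests, is to time the connection once and then time the query repeatedly on the open connection, discarding a warm-up run so that cache effects are visible rather than hidden. The sketch below is a generic harness and not Phoenix-specific code: `time_query`, `fake_connect`, and `fake_query` are hypothetical names, and the two stand-in functions only sleep to imitate a slow connect and a fast query.

```python
import time

def time_query(connect, run_query, warmup=1, runs=5):
    """Time connection setup once, then time repeated queries on the open connection."""
    t0 = time.perf_counter()
    conn = connect()
    connect_seconds = time.perf_counter() - t0

    # Warm-up runs populate caches (for example the block cache); not recorded.
    for _ in range(warmup):
        run_query(conn)

    query_seconds = []
    for _ in range(runs):
        t0 = time.perf_counter()
        run_query(conn)
        query_seconds.append(time.perf_counter() - t0)
    return connect_seconds, query_seconds

# Stand-ins for a real driver (assumptions for illustration only).
def fake_connect():
    time.sleep(0.05)      # imitates slow connection setup
    return object()

def fake_query(conn):
    time.sleep(0.001)     # imitates a fast point query

connect_s, query_s = time_query(fake_connect, fake_query)
median_query = sorted(query_s)[len(query_s) // 2]
print(f"connect: {connect_s:.3f}s, median query: {median_query:.4f}s")
```

To measure cold-cache behaviour instead, set `warmup=0` and look at the first recorded run separately from the rest.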

On Fri, Sep 5, 2014 at 9:00 AM, James Taylor <ja...@apache.org> wrote:
> Hi Vikas,
> Please post your schema and query as it's difficult to have a discussion
> without those. Also if you could post your HBase code, that would be
> interesting as well.
> Thanks,
> James
>
>
> On Friday, September 5, 2014, yeshwanth kumar <ye...@gmail.com> wrote:
>>
>> hi vikas,
>>
>> we used Phoenix on a 4-core/23 GB machine, as a single-node setup,
>> with HDP 2.1.
>> Our table has 50-70M rows.
>> A select on that table took less than 2 seconds.
>> Aggregation queries took less than 8 seconds.
>> To achieve good performance we created a secondary index on the table.
>>
>> Make sure you have fine-tuned HBase;
>> enabling compression on the data makes a difference in response time.
>> Also distribute the data and load over all regions in HBase,
>> and look at the performance tips mentioned on the Phoenix blog.
>>
>> Cheers,
>> Yeshwanth

Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
Hi Vikas,
Please post your schema and query as it's difficult to have a discussion
without those. Also if you could post your HBase code, that would be
interesting as well.
Thanks,
James


Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
We are generating the DDL using Java string concatenation, and the table is
created on topology submission. I did not copy the exact statement, so it
will take me some time to get it again because I have to resubmit the
topology.



Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
Would you mind posting the CREATE TABLE statement you used to create
the table, as it's a little easier to read?
Thanks,
James


Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
James,

Schema is pretty simple, I guess. Here it is (I have renamed some actual
column names)

| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME     | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | COLUMN_DEF | SQ |
+-----------+-------------+------------+-------------+-----------+---------------+-------------+---------------+----------------+----------------+----------+------------+----+
| null      | null        | table_name | TIMESTAMP   | -5        | BIGINT        | null        | null          | null           | null           | 0        | null       |
| null      | null        | table_name | ID          | 12        | VARCHAR       | 255         | null          | null           | null           | 0        | null       |
| null      | null        | table_name | TEXT_FIELD  | 12        | VARCHAR       | 255         | null          | null           | null           | 1        | null       |
| null      | null        | table_name | USER_ID     | 12        | VARCHAR       | 255         | null          | null           | null           | 0        | null       |
| null      | null        | table_name | TEXT_FIELD  | 12        | VARCHAR       | 25523       | null          | null           | null           | 1        | null       |
| null      | null        | table_name | TYPE        | 12        | VARCHAR       | 255         | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_1     | 4         | INTEGER       | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_2     | 4         | INTEGER       | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_3     | 4         | INTEGER       | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_4     | -5        | BIGINT        | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_5     | -5        | BIGINT        | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | COUNT_6     | -5        | BIGINT        | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | TAGS        | 2003      | VARCHAR_ARRAY | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | UPDATED     | -5        | BIGINT        | null        | null          | null           | null           | 1        | null       |
| null      | null        | table_name | SOME_FIELD  | 12        | VARCHAR       | 255         | null          | null           | null           | 1        | null       |
| null      | null        | table_name | LOCATIONS   | 12        | VARCHAR       | 255         | null          | null           | null           | 1        | null       |
+-----------+-------------+------------+-------------+-----------+---------------+-------------+---------------+----------------+----------------+----------+------------+----+


*Query:*

SELECT USER_ID FROM HJK_SI_LEAD_FEED WHERE ID='507449491025170432';




On Sat, Sep 6, 2014 at 10:15 AM, James Taylor <ja...@apache.org> wrote:

> Vikas,
> Please post your schema and query.
> Thanks,
> James
>
> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
> > Ours is also a single-node setup right now, with fewer than 1 million
> > rows so far; that is expected to grow to at least 100 million.
> >
> > I am aware of secondary indexes, but when I am querying on the
> > primary/row key, why would it take so much time?
> >
> > I am querying directly, using sqlline for Phoenix and the hbase shell
> > for the HBase query. I do not expect to need any fine-tuning for such
> > a small dataset; I am assuming a minimum level of performance out of
> > the box.

Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
It was not an issue with Amazon at all. Amazon can't slow things down this
much. :)

The issue was on our side. Earlier we had two columns as the primary key,
the timestamp and id fields, and recently we changed it to the single
field id. The Phoenix table was not updated to reflect that. :)

Now a new question: do I need to drop the table to make this primary key
change? I guess yes, because behind the scenes HBase has already created row
keys using the two fields. Still, I am asking to confirm, and to get ideas
for not losing the existing data.

Thanks all for the help. :)
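
A toy model may help show why the key mismatch described above costs so much. When rows are sorted by a composite key that leads with the timestamp, a filter on id alone has no prefix to seek to, so in the worst case every key is examined; when the key is the id itself, a sorted store can seek straight to it. The sketch below only imitates this with a sorted Python list. It is not Phoenix's or HBase's actual scan logic, and `seek`, `scan_for_id`, and the sample values are hypothetical.

```python
from bisect import bisect_left

def seek(sorted_keys, key):
    """Seek directly to `key` in a sorted key list, as a row-key lookup would."""
    i = bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key

def scan_for_id(composite_keys, wanted_id):
    """Find rows by id when the key is (timestamp, id).

    The leading timestamp is unknown, so no seek is possible and
    every key has to be examined.
    """
    examined = 0
    matches = []
    for ts, row_id in composite_keys:
        examined += 1
        if row_id == wanted_id:
            matches.append((ts, row_id))
    return matches, examined

# Toy data: 1,000 rows keyed first by timestamp, then by id (made-up values).
rows = sorted((1_409_900_000 + i, f"id-{i:04d}") for i in range(1000))

# Old layout: key is (timestamp, id), query filters on id only.
matches, examined = scan_for_id(rows, "id-0500")

# New layout: key is id alone, so the lookup can seek.
ids_only = sorted(row_id for _, row_id in rows)
found = seek(ids_only, "id-0500")

print(f"composite key: examined {examined} keys; id-only key found: {found}")
```

In Phoenix itself, running EXPLAIN on the query in sqlline is a quick way to check which kind of scan the planner chose for a given table definition.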


On Sat, Sep 6, 2014 at 10:33 PM, James Taylor <ja...@apache.org>
wrote:

> I don't have experience running Phoenix in AWS. Andrew Purtell is a
> good person to ask. I'm curious if our support under their EMR helps
> in any way: http://phoenix.apache.org/phoenix_on_emr.html
>
> Thanks,
> James
>
> On Sat, Sep 6, 2014 at 12:27 AM, Alex Kamil <al...@gmail.com> wrote:
> > Not sure, that's not my experience with Phoenix, but if you have an
> > unstable network connection to your storage (which EBS is well known
> > for), it may affect the results.
> >
> >
> > On Sat, Sep 6, 2014 at 3:14 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:
> >>
> >> Of course, I can do a lot of optimizations. However, my concern is what
> >> I am missing that causes Phoenix to perform badly while, at exactly the
> >> same time, HBase gives results amazingly fast.
> >>
> >>
> >> On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <al...@gmail.com> wrote:
> >>>
> >>> Well, it is still network attached. If you allocate enough heap to fit
> >>> the whole thing in memory (in hbase/conf/hbase-env.sh) you could
> >>> probably eliminate this as a possible reason.
> >>>
> >>>
> >>> On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vi...@infoobjects.com>
> >>> wrote:
> >>>>
> >>>> EBS but with new generation SSD not magnetic one.
> >>>>
> >>>>
> >>>> On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <al...@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> do you use EBS or ephemeral storage, I found EBS performance to be
> >>>>> somewhat unpredictable
> >>>>>
> >>>>>
> >>>>> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vikas@infoobjects.com> wrote:
> >>>>>>
> >>>>>> Hbase is 0.98.0
> >>>>>> Phoenix is 4.0
> >>>>>>
> >>>>>>
> >>>>>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vikas@infoobjects.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Yes, that is why it is troubling me. However, the HBase shell is on
> >>>>>>> the same machine and in the same environment, so if it were an issue
> >>>>>>> of resources (CPU or memory) it should have affected HBase too, but
> >>>>>>> HBase is able to give me results within 0.0150 seconds. :(
> >>>>>>>
> >>>>>>> No, I haven't tested it outside AWS. I guess that should not be the
> >>>>>>> cause, given the much better performance of the native HBase query
> >>>>>>> in the HBase shell.
> >>>>>>>
> >>>>>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor
> >>>>>>> <ja...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>> Something is up in your environment. What version of Phoenix and
> >>>>>>>> HBase
> >>>>>>>> are you using and in what environment? Have you tried this
> locally,
> >>>>>>>> outside of AWS to compare?
> >>>>>>>>
> >>>>>>>> Take a look at our perf numbers, generated more-or-less daily, and
> >>>>>>>> which run over more data that what you're testing against:
> >>>>>>>>
> >>>>>>>>
> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
> >>>>>>>>
> >>>>>>>> Some of these are point queries and they take in the neighborhood
> of
> >>>>>>>> 0.01 seconds.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> James
> >>>>>>>>
> >>>>>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal
> >>>>>>>> <vi...@infoobjects.com> wrote:
> >>>>>>>> > Missed to mention that count query (posted in my last mail) is
> >>>>>>>> > also taking
> >>>>>>>> > very long time to return the count.
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal
> >>>>>>>> > <vi...@infoobjects.com>
> >>>>>>>> > wrote:
> >>>>>>>> >>
> >>>>>>>> >> As I mentioned, schema is nothing but bunch of fields (some
> being
> >>>>>>>> >> integers, longs and text) along with primary key (row key) and
> I
> >>>>>>>> >> am making
> >>>>>>>> >> simple query to get result for a particular primary key,
> nothing
> >>>>>>>> >> more than
> >>>>>>>> >> that.
> >>>>>>>> >>
> >>>>>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
> >>>>>>>> >>
> >>>>>>>> >> +------------+
> >>>>>>>> >>
> >>>>>>>> >> |  COUNT(1)  |
> >>>>>>>> >>
> >>>>>>>> >> +------------+
> >>>>>>>> >>
> >>>>>>>> >> | 4667515    |
> >>>>>>>> >>
> >>>>>>>> >> +------------+
> >>>>>>>> >>
> >>>>>>>> >> 1 row selected (132.11 seconds)
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
> >>>>>>>> >> <pu...@pubmatic.com> wrote:
> >>>>>>>> >>>
> >>>>>>>> >>> If you can share the schema,data type,cardinality of each
> >>>>>>>> >>> dimension and
> >>>>>>>> >>> usual queries, I can help to design a schema with performance
> of
> >>>>>>>> >>> less than 1
> >>>>>>>> >>> sec using Phoenix.
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> Thanks
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> ------ Original message------
> >>>>>>>> >>>
> >>>>>>>> >>> From: James Taylor
> >>>>>>>> >>>
> >>>>>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
> >>>>>>>> >>>
> >>>>>>>> >>> To: user;
> >>>>>>>> >>>
> >>>>>>>> >>> Subject:Re: Phoenix response time
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> Vikas,
> >>>>>>>> >>> Please post your schema and query.
> >>>>>>>> >>> Thanks,
> >>>>>>>> >>> James
> >>>>>>>> >>>
> >>>>>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal
> >>>>>>>> >>> <vi...@infoobjects.com>
> >>>>>>>> >>> wrote:
> >>>>>>>> >>> > Ours is also a single node setup right now and as of now
> there
> >>>>>>>> >>> > are less
> >>>>>>>> >>> > than
> >>>>>>>> >>> > 1 million rows which is expected to grow around 100m at
> >>>>>>>> >>> > minimum.
> >>>>>>>> >>> >
> >>>>>>>> >>> > I am aware of secondary indexes but when I am querying on
> >>>>>>>> >>> > primary/row
> >>>>>>>> >>> > key,
> >>>>>>>> >>> > why would it take so much time?
> >>>>>>>> >>> >
> >>>>>>>> >>> > I am directly querying using sqlline for Phoenix and hbase
> >>>>>>>> >>> > shell for
> >>>>>>>> >>> > HBase
> >>>>>>>> >>> > query. I am not expecting to do any fine tuning for such
> small
> >>>>>>>> >>> > dataset.
> >>>>>>>> >>> > I am
> >>>>>>>> >>> > assumimg a minimum performance level out of the box.
> >>>>>>>> >>> >
> >>>>>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar
> >>>>>>>> >>> > <ye...@gmail.com>
> >>>>>>>> >>> > wrote:
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> hi vikas,
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node
> >>>>>>>> >>> >> setup.
> >>>>>>>> >>> >> used HDP 2.1
> >>>>>>>> >>> >> our table has 50-70M rows,
> >>>>>>>> >>> >> select on that table took less than 2 seconds.
> >>>>>>>> >>> >> Aggregation queries took less than 8 seconds.
> >>>>>>>> >>> >> for achieving good performance we created secondary index
> on
> >>>>>>>> >>> >> the
> >>>>>>>> >>> >> table.
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> make sure you finetuned hbase,
> >>>>>>>> >>> >> enabling compression on the data makes a difference in
> >>>>>>>> >>> >> response.
> >>>>>>>> >>> >> if u distribute the data and load over all regions in
> hbase,
> >>>>>>>> >>> >> look at the performance tips mentioned in phoenix blog
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> -yeshwanth
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> Cheers,
> >>>>>>>> >>> >> Yeshwanth
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal
> >>>>>>>> >>> >> <vi...@infoobjects.com>
> >>>>>>>> >>> >> wrote:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Hi,
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Preface: We are testing phoenix using Hortonworks
> >>>>>>>> >>> >>> distribution for
> >>>>>>>> >>> >>> HBase
> >>>>>>>> >>> >>> on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> With contrast to performance benchmarks, I found Phoenix
> to
> >>>>>>>> >>> >>> be very
> >>>>>>>> >>> >>> slow
> >>>>>>>> >>> >>> in querying even on primary key or row key. So, tried to
> >>>>>>>> >>> >>> increase the
> >>>>>>>> >>> >>> RAM
> >>>>>>>> >>> >>> for HBase and Phoenix and increasing the CPU and RAM by
> >>>>>>>> >>> >>> upgrading the
> >>>>>>>> >>> >>> EC2
> >>>>>>>> >>> >>> machine type to r3.xlarge (4 CPU, 30 GB RAM). Results were
> >>>>>>>> >>> >>> like this:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Time takes in returning result of query on row key:
> >>>>>>>> >>> >>> With Storm running and very less RAM available: 50 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase:
> >>>>>>>> >>> >>> 18 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB
> >>>>>>>> >>> >>> RAM): 8 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU,
> >>>>>>>> >>> >>> 15 GB
> >>>>>>>> >>> >>> RAM):
> >>>>>>>> >>> >>> 0.0150 seconds. :)
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> So, the difference seems to be many fold of what native
> >>>>>>>> >>> >>> HBase is
> >>>>>>>> >>> >>> providing to us. I am not able to understand how it can be
> >>>>>>>> >>> >>> possible?
> >>>>>>>> >>> >>> What I
> >>>>>>>> >>> >>> am missing here?
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> --
> >>>>>>>> >>> >>> Regards,
> >>>>>>>> >>> >>> Vikas Agarwal
> >>>>>>>> >>> >>> 91 – 9928301411
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> InfoObjects, Inc.
> >>>>>>>> >>> >>> Execution Matters
> >>>>>>>> >>> >>> http://www.infoobjects.com
> >>>>>>>> >>> >>> 2041 Mission College Boulevard, #280
> >>>>>>>> >>> >>> Santa Clara, CA 95054
> >>>>>>>> >>> >>> +1 (408) 988-2000 Work
> >>>>>>>> >>> >>> +1 (408) 716-2726 Fax
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >
> >>>>>>>> >>> >
> >>>>>>>> >>> > --
> >>>>>>>> >>> > Regards,
> >>>>>>>> >>> > Vikas Agarwal
> >>>>>>>> >>> > 91 – 9928301411
> >>>>>>>> >>> >
> >>>>>>>> >>> > InfoObjects, Inc.
> >>>>>>>> >>> > Execution Matters
> >>>>>>>> >>> > http://www.infoobjects.com
> >>>>>>>> >>> > 2041 Mission College Boulevard, #280
> >>>>>>>> >>> > Santa Clara, CA 95054
> >>>>>>>> >>> > +1 (408) 988-2000 Work
> >>>>>>>> >>> > +1 (408) 716-2726 Fax
> >>>>>>>> >>> >
> >>>>>>>> >>> >
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >> --
> >>>>>>>> >> Regards,
> >>>>>>>> >> Vikas Agarwal
> >>>>>>>> >> 91 – 9928301411
> >>>>>>>> >>
> >>>>>>>> >> InfoObjects, Inc.
> >>>>>>>> >> Execution Matters
> >>>>>>>> >> http://www.infoobjects.com
> >>>>>>>> >> 2041 Mission College Boulevard, #280
> >>>>>>>> >> Santa Clara, CA 95054
> >>>>>>>> >> +1 (408) 988-2000 Work
> >>>>>>>> >> +1 (408) 716-2726 Fax
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > --
> >>>>>>>> > Regards,
> >>>>>>>> > Vikas Agarwal
> >>>>>>>> > 91 – 9928301411
> >>>>>>>> >
> >>>>>>>> > InfoObjects, Inc.
> >>>>>>>> > Execution Matters
> >>>>>>>> > http://www.infoobjects.com
> >>>>>>>> > 2041 Mission College Boulevard, #280
> >>>>>>>> > Santa Clara, CA 95054
> >>>>>>>> > +1 (408) 988-2000 Work
> >>>>>>>> > +1 (408) 716-2726 Fax
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Regards,
> >>>>>>> Vikas Agarwal
> >>>>>>> 91 – 9928301411
> >>>>>>>
> >>>>>>> InfoObjects, Inc.
> >>>>>>> Execution Matters
> >>>>>>> http://www.infoobjects.com
> >>>>>>> 2041 Mission College Boulevard, #280
> >>>>>>> Santa Clara, CA 95054
> >>>>>>> +1 (408) 988-2000 Work
> >>>>>>> +1 (408) 716-2726 Fax
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Regards,
> >>>>>> Vikas Agarwal
> >>>>>> 91 – 9928301411
> >>>>>>
> >>>>>> InfoObjects, Inc.
> >>>>>> Execution Matters
> >>>>>> http://www.infoobjects.com
> >>>>>> 2041 Mission College Boulevard, #280
> >>>>>> Santa Clara, CA 95054
> >>>>>> +1 (408) 988-2000 Work
> >>>>>> +1 (408) 716-2726 Fax
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Regards,
> >>>> Vikas Agarwal
> >>>> 91 – 9928301411
> >>>>
> >>>> InfoObjects, Inc.
> >>>> Execution Matters
> >>>> http://www.infoobjects.com
> >>>> 2041 Mission College Boulevard, #280
> >>>> Santa Clara, CA 95054
> >>>> +1 (408) 988-2000 Work
> >>>> +1 (408) 716-2726 Fax
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Vikas Agarwal
> >> 91 – 9928301411
> >>
> >> InfoObjects, Inc.
> >> Execution Matters
> >> http://www.infoobjects.com
> >> 2041 Mission College Boulevard, #280
> >> Santa Clara, CA 95054
> >> +1 (408) 988-2000 Work
> >> +1 (408) 716-2726 Fax
> >
> >
>



-- 
Regards,
Vikas Agarwal

Re: Phoenix response time

Posted by Andrew Purtell <ap...@apache.org>.
Certainly the EMR support makes it easy to deploy a test cluster and
kick the tires.

On Sat, Sep 6, 2014 at 10:03 AM, James Taylor <ja...@apache.org> wrote:
> I don't have experience running Phoenix in AWS. Andrew Purtell is a
> good person to ask. I'm curious if our support under their EMR helps
> in any way: http://phoenix.apache.org/phoenix_on_emr.html
>
> Thanks,
> James



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
I don't have experience running Phoenix in AWS. Andrew Purtell is a
good person to ask. I'm curious if our support under their EMR helps
in any way: http://phoenix.apache.org/phoenix_on_emr.html

Thanks,
James

On Sat, Sep 6, 2014 at 12:27 AM, Alex Kamil <al...@gmail.com> wrote:
> Not sure, that's not my experience with Phoenix, but if you have an unstable
> network connection to your storage (which EBS is well known for), it may
> affect the results.

Re: Phoenix response time

Posted by Alex Kamil <al...@gmail.com>.
Not sure; that's not my experience with Phoenix. But if you have an unstable
network connection to your storage (which EBS is well known for), it may
affect the results.


On Sat, Sep 6, 2014 at 3:14 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:

> Of course, I can do a lot of optimizations. However, my concern is what I
> am missing that causes Phoenix to perform badly while, at exactly the same
> time, HBase returns results amazingly fast.

Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
Of course, I can do a lot of optimizations. However, my concern is what I
am missing that causes Phoenix to perform badly while, at exactly the same
time, HBase returns results amazingly fast.


On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <al...@gmail.com> wrote:

> Well, it is still network-attached. If you allocate enough heap to fit the
> whole thing in memory (in hbase/conf/hbase-env.sh), you could probably
> eliminate this as a possible reason.



Re: Phoenix response time

Posted by Alex Kamil <al...@gmail.com>.
Well, it is still network-attached. If you allocate enough heap to fit the
whole thing in memory (in hbase/conf/hbase-env.sh), you could probably
eliminate this as a possible reason.
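
For reference, a minimal sketch of what such a heap change could look like,
assuming a stock hbase-env.sh from the HBase 0.98 line, where HBASE_HEAPSIZE
is given in MB. The 8000 figure is purely illustrative and was not suggested
anywhere in this thread; size it to the data set and the instance's RAM.

```shell
# hbase/conf/hbase-env.sh (fragment)
# Maximum heap for the HBase daemons, in MB on the 0.98 line (default 1000).
# 8000 is an illustrative value only, not a recommendation from this thread.
export HBASE_HEAPSIZE=8000
```

The share of that heap used for the block cache is governed separately by
hfile.block.cache.size in hbase-site.xml, so raising the heap alone does not
guarantee that the whole table stays cached. The region server has to be
restarted for the change to take effect.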


On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:

> EBS, but with the new-generation SSD, not the magnetic one.

Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
EBS, but with the new-generation SSD, not the magnetic one.


On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <al...@gmail.com> wrote:

> Do you use EBS or ephemeral storage? I have found EBS performance to be
> somewhat unpredictable.
>
>
> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vi...@infoobjects.com>
> wrote:
>
>> Hbase is 0.98.0
>> Phoenix is 4.0
>>
>>
>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vi...@infoobjects.com>
>> wrote:
>>
>>> Yes, that is why it is a trouble for me. However, on contrary, HBase
>>> shell is also on the same machine and same environment, so if it is an
>>> issue of resource (CPU or memory) it should have affected the HBase too,
>>> but HBase is able to give me results within 0.0150 seconds. :(
>>>
>>> No, I haven't tested it outside AWS. I guess, it should not be the case
>>> due to much better performance by native HBase query on HBase shell.
>>>
>>>
>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <ja...@apache.org>
>>> wrote:
>>>
>>>> Something is up in your environment. What version of Phoenix and HBase
>>>> are you using and in what environment? Have you tried this locally,
>>>> outside of AWS to compare?
>>>>
>>>> Take a look at our perf numbers, generated more-or-less daily, and
>>>> which run over more data that what you're testing against:
>>>>
>>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>>>
>>>> Some of these are point queries and they take in the neighborhood of
>>>> 0.01 seconds.
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vi...@infoobjects.com>
>>>> wrote:
>>>> > Missed to mention that count query (posted in my last mail) is also
>>>> taking
>>>> > very long time to return the count.
>>>> >
>>>> >
>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vikas@infoobjects.com
>>>> >
>>>> > wrote:
>>>> >>
>>>> >> As I mentioned, schema is nothing but bunch of fields (some being
>>>> >> integers, longs and text) along with primary key (row key) and I am
>>>> making
>>>> >> simple query to get result for a particular primary key, nothing
>>>> more than
>>>> >> that.
>>>> >>
>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>>>> >>
>>>> >> +------------+
>>>> >>
>>>> >> |  COUNT(1)  |
>>>> >>
>>>> >> +------------+
>>>> >>
>>>> >> | 4667515    |
>>>> >>
>>>> >> +------------+
>>>> >>
>>>> >> 1 row selected (132.11 seconds)
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>>>> >> <pu...@pubmatic.com> wrote:
>>>> >>>
>>>> >>> If you can share the schema,data type,cardinality of each dimension
>>>> and
>>>> >>> usual queries, I can help to design a schema with performance of
>>>> less than 1
>>>> >>> sec using Phoenix.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Thanks
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> ------ Original message------
>>>> >>>
>>>> >>> From: James Taylor
>>>> >>>
>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>>>> >>>
>>>> >>> To: user;
>>>> >>>
>>>> >>> Subject:Re: Phoenix response time
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Vikas,
>>>> >>> Please post your schema and query.
>>>> >>> Thanks,
>>>> >>> James
>>>> >>>
>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <
>>>> vikas@infoobjects.com>
>>>> >>> wrote:
>>>> >>> > Ours is also a single node setup right now and as of now there
>>>> are less
>>>> >>> > than
>>>> >>> > 1 million rows which is expected to grow around 100m at minimum.
>>>> >>> >
>>>> >>> > I am aware of secondary indexes but when I am querying on
>>>> primary/row
>>>> >>> > key,
>>>> >>> > why would it take so much time?
>>>> >>> >
>>>> >>> > I am directly querying using sqlline for Phoenix and hbase shell
>>>> for
>>>> >>> > HBase
>>>> >>> > query. I am not expecting to do any fine tuning for such small
>>>> dataset.
>>>> >>> > I am
>>>> >>> > assumimg a minimum performance level out of the box.
>>>> >>> >
>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar <
>>>> yeshwanth43@gmail.com>
>>>> >>> > wrote:
>>>> >>> >>
>>>> >>> >> hi vikas,
>>>> >>> >>
>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>>>> >>> >> used HDP 2.1
>>>> >>> >> our table has 50-70M rows,
>>>> >>> >> select on that table took less than 2 seconds.
>>>> >>> >> Aggregation queries took less than 8 seconds.
>>>> >>> >> for achieving good performance we created secondary index on the
>>>> >>> >> table.
>>>> >>> >>
>>>> >>> >> make sure you finetuned hbase,
>>>> >>> >> enabling compression on the data makes a difference in response.
>>>> >>> >> if u distribute the data and load over all regions in hbase,
>>>> >>> >> look at the performance tips mentioned in phoenix blog
>>>> >>> >>
>>>> >>> >> -yeshwanth
>>>> >>> >>
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> Cheers,
>>>> >>> >> Yeshwanth
>>>> >>> >>
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <
>>>> vikas@infoobjects.com>
>>>> >>> >> wrote:
>>>> >>> >>>
>>>> >>> >>> Hi,
>>>> >>> >>>
>>>> >>> >>> Preface: We are testing phoenix using Hortonworks distribution
>>>> for
>>>> >>> >>> HBase
>>>> >>> >>> on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
>>>> >>> >>>
>>>> >>> >>> With contrast to performance benchmarks, I found Phoenix to be
>>>> very
>>>> >>> >>> slow
>>>> >>> >>> in querying even on primary key or row key. So, tried to
>>>> increase the
>>>> >>> >>> RAM
>>>> >>> >>> for HBase and Phoenix and increasing the CPU and RAM by
>>>> upgrading the
>>>> >>> >>> EC2
>>>> >>> >>> machine type to r3.xlarge (4 CPU, 30 GB RAM). Results were like
>>>> this:
>>>> >>> >>>
>>>> >>> >>> Time takes in returning result of query on row key:
>>>> >>> >>> With Storm running and very less RAM available: 50 sec
>>>> >>> >>>
>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18
>>>> sec
>>>> >>> >>>
>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM):
>>>> 8 sec
>>>> >>> >>>
>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB
>>>> >>> >>> RAM):
>>>> >>> >>> 0.0150 seconds. :)
>>>> >>> >>>
>>>> >>> >>> So, the difference seems to be many fold of what native HBase is
>>>> >>> >>> providing to us. I am not able to understand how it can be
>>>> possible?
>>>> >>> >>> What I
>>>> >>> >>> am missing here?
>>>> >>> >>>



Re: Phoenix response time

Posted by Alex Kamil <al...@gmail.com>.
Do you use EBS or ephemeral storage? I have found EBS performance to be
somewhat unpredictable.


On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:

> Hbase is 0.98.0
> Phoenix is 4.0
>
>
> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vi...@infoobjects.com>
> wrote:
>
>> Yes, that is why it is a trouble for me. However, on contrary, HBase
>> shell is also on the same machine and same environment, so if it is an
>> issue of resource (CPU or memory) it should have affected the HBase too,
>> but HBase is able to give me results within 0.0150 seconds. :(
>>
>> No, I haven't tested it outside AWS. I guess, it should not be the case
>> due to much better performance by native HBase query on HBase shell.
>>
>>
>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <ja...@apache.org>
>> wrote:
>>
>>> Something is up in your environment. What version of Phoenix and HBase
>>> are you using and in what environment? Have you tried this locally,
>>> outside of AWS to compare?
>>>
>>> Take a look at our perf numbers, generated more-or-less daily, and
>>> which run over more data that what you're testing against:
>>>
>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>>
>>> Some of these are point queries and they take in the neighborhood of
>>> 0.01 seconds.
>>>
>>> Thanks,
>>> James
>>>
>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vi...@infoobjects.com>
>>> wrote:
>>> > Missed to mention that count query (posted in my last mail) is also
>>> taking
>>> > very long time to return the count.
>>> >
>>> >
>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vi...@infoobjects.com>
>>> > wrote:
>>> >>
>>> >> As I mentioned, schema is nothing but bunch of fields (some being
>>> >> integers, longs and text) along with primary key (row key) and I am
>>> making
>>> >> simple query to get result for a particular primary key, nothing more
>>> than
>>> >> that.
>>> >>
>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>>> >>
>>> >> +------------+
>>> >>
>>> >> |  COUNT(1)  |
>>> >>
>>> >> +------------+
>>> >>
>>> >> | 4667515    |
>>> >>
>>> >> +------------+
>>> >>
>>> >> 1 row selected (132.11 seconds)
>>> >>
>>> >>
>>> >>
>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>>> >> <pu...@pubmatic.com> wrote:
>>> >>>
>>> >>> If you can share the schema,data type,cardinality of each dimension
>>> and
>>> >>> usual queries, I can help to design a schema with performance of
>>> less than 1
>>> >>> sec using Phoenix.
>>> >>>
>>> >>>
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> ------ Original message------
>>> >>>
>>> >>> From: James Taylor
>>> >>>
>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>>> >>>
>>> >>> To: user;
>>> >>>
>>> >>> Subject:Re: Phoenix response time
>>> >>>
>>> >>>
>>> >>>
>>> >>> Vikas,
>>> >>> Please post your schema and query.
>>> >>> Thanks,
>>> >>> James
>>> >>>
>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vikas@infoobjects.com
>>> >
>>> >>> wrote:
>>> >>> > Ours is also a single node setup right now and as of now there are
>>> less
>>> >>> > than
>>> >>> > 1 million rows which is expected to grow around 100m at minimum.
>>> >>> >
>>> >>> > I am aware of secondary indexes but when I am querying on
>>> primary/row
>>> >>> > key,
>>> >>> > why would it take so much time?
>>> >>> >
>>> >>> > I am directly querying using sqlline for Phoenix and hbase shell
>>> for
>>> >>> > HBase
>>> >>> > query. I am not expecting to do any fine tuning for such small
>>> dataset.
>>> >>> > I am
>>> >>> > assumimg a minimum performance level out of the box.
>>> >>> >
>>> >>> > On Friday, September 5, 2014, yeshwanth kumar <
>>> yeshwanth43@gmail.com>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> hi vikas,
>>> >>> >>
>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>>> >>> >> used HDP 2.1
>>> >>> >> our table has 50-70M rows,
>>> >>> >> select on that table took less than 2 seconds.
>>> >>> >> Aggregation queries took less than 8 seconds.
>>> >>> >> for achieving good performance we created secondary index on the
>>> >>> >> table.
>>> >>> >>
>>> >>> >> make sure you finetuned hbase,
>>> >>> >> enabling compression on the data makes a difference in response.
>>> >>> >> if u distribute the data and load over all regions in hbase,
>>> >>> >> look at the performance tips mentioned in phoenix blog
>>> >>> >>
>>> >>> >> -yeshwanth
>>> >>> >>
>>> >>> >>
>>> >>> >>
>>> >>> >> Cheers,
>>> >>> >> Yeshwanth
>>> >>> >>
>>> >>> >>
>>> >>> >>
>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <
>>> vikas@infoobjects.com>
>>> >>> >> wrote:
>>> >>> >>>
>>> >>> >>> Hi,
>>> >>> >>>
>>> >>> >>> Preface: We are testing phoenix using Hortonworks distribution
>>> for
>>> >>> >>> HBase
>>> >>> >>> on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
>>> >>> >>>
>>> >>> >>> With contrast to performance benchmarks, I found Phoenix to be
>>> very
>>> >>> >>> slow
>>> >>> >>> in querying even on primary key or row key. So, tried to
>>> increase the
>>> >>> >>> RAM
>>> >>> >>> for HBase and Phoenix and increasing the CPU and RAM by
>>> upgrading the
>>> >>> >>> EC2
>>> >>> >>> machine type to r3.xlarge (4 CPU, 30 GB RAM). Results were like
>>> this:
>>> >>> >>>
>>> >>> >>> Time takes in returning result of query on row key:
>>> >>> >>> With Storm running and very less RAM available: 50 sec
>>> >>> >>>
>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
>>> >>> >>>
>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM):
>>> 8 sec
>>> >>> >>>
>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB
>>> >>> >>> RAM):
>>> >>> >>> 0.0150 seconds. :)
>>> >>> >>>
>>> >>> >>> So, the difference seems to be many fold of what native HBase is
>>> >>> >>> providing to us. I am not able to understand how it can be
>>> possible?
>>> >>> >>> What I
>>> >>> >>> am missing here?
>>> >>> >>>

Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
HBase is 0.98.0
Phoenix is 4.0


On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vi...@infoobjects.com>
wrote:

> Yes, that is why it is a trouble for me. However, on contrary, HBase shell
> is also on the same machine and same environment, so if it is an issue of
> resource (CPU or memory) it should have affected the HBase too, but HBase
> is able to give me results within 0.0150 seconds. :(
>
> No, I haven't tested it outside AWS. I guess, it should not be the case
> due to much better performance by native HBase query on HBase shell.
>
>
> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <ja...@apache.org>
> wrote:
>
>> Something is up in your environment. What version of Phoenix and HBase
>> are you using and in what environment? Have you tried this locally,
>> outside of AWS to compare?
>>
>> Take a look at our perf numbers, generated more-or-less daily, and
>> which run over more data that what you're testing against:
>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>
>> Some of these are point queries and they take in the neighborhood of
>> 0.01 seconds.
>>
>> Thanks,
>> James
>>
>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vi...@infoobjects.com>
>> wrote:
>> > Missed to mention that count query (posted in my last mail) is also
>> taking
>> > very long time to return the count.
>> >
>> >
>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vi...@infoobjects.com>
>> > wrote:
>> >>
>> >> As I mentioned, schema is nothing but bunch of fields (some being
>> >> integers, longs and text) along with primary key (row key) and I am
>> making
>> >> simple query to get result for a particular primary key, nothing more
>> than
>> >> that.
>> >>
>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>> >>
>> >> +------------+
>> >>
>> >> |  COUNT(1)  |
>> >>
>> >> +------------+
>> >>
>> >> | 4667515    |
>> >>
>> >> +------------+
>> >>
>> >> 1 row selected (132.11 seconds)
>> >>
>> >>
>> >>
>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>> >> <pu...@pubmatic.com> wrote:
>> >>>
>> >>> If you can share the schema,data type,cardinality of each dimension
>> and
>> >>> usual queries, I can help to design a schema with performance of less
>> than 1
>> >>> sec using Phoenix.
>> >>>
>> >>>
>> >>>
>> >>> Thanks
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> ------ Original message------
>> >>>
>> >>> From: James Taylor
>> >>>
>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>> >>>
>> >>> To: user;
>> >>>
>> >>> Subject:Re: Phoenix response time
>> >>>
>> >>>
>> >>>
>> >>> Vikas,
>> >>> Please post your schema and query.
>> >>> Thanks,
>> >>> James
>> >>>
>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vi...@infoobjects.com>
>> >>> wrote:
>> >>> > Ours is also a single node setup right now and as of now there are
>> less
>> >>> > than
>> >>> > 1 million rows which is expected to grow around 100m at minimum.
>> >>> >
>> >>> > I am aware of secondary indexes but when I am querying on
>> primary/row
>> >>> > key,
>> >>> > why would it take so much time?
>> >>> >
>> >>> > I am directly querying using sqlline for Phoenix and hbase shell for
>> >>> > HBase
>> >>> > query. I am not expecting to do any fine tuning for such small
>> dataset.
>> >>> > I am
>> >>> > assumimg a minimum performance level out of the box.
>> >>> >
>> >>> > On Friday, September 5, 2014, yeshwanth kumar <
>> yeshwanth43@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> hi vikas,
>> >>> >>
>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>> >>> >> used HDP 2.1
>> >>> >> our table has 50-70M rows,
>> >>> >> select on that table took less than 2 seconds.
>> >>> >> Aggregation queries took less than 8 seconds.
>> >>> >> for achieving good performance we created secondary index on the
>> >>> >> table.
>> >>> >>
>> >>> >> make sure you finetuned hbase,
>> >>> >> enabling compression on the data makes a difference in response.
>> >>> >> if u distribute the data and load over all regions in hbase,
>> >>> >> look at the performance tips mentioned in phoenix blog
>> >>> >>
>> >>> >> -yeshwanth
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> Cheers,
>> >>> >> Yeshwanth
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <
>> vikas@infoobjects.com>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> Hi,
>> >>> >>>
>> >>> >>> Preface: We are testing phoenix using Hortonworks distribution for
>> >>> >>> HBase
>> >>> >>> on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
>> >>> >>>
>> >>> >>> With contrast to performance benchmarks, I found Phoenix to be
>> very
>> >>> >>> slow
>> >>> >>> in querying even on primary key or row key. So, tried to increase
>> the
>> >>> >>> RAM
>> >>> >>> for HBase and Phoenix and increasing the CPU and RAM by upgrading
>> the
>> >>> >>> EC2
>> >>> >>> machine type to r3.xlarge (4 CPU, 30 GB RAM). Results were like
>> this:
>> >>> >>>
>> >>> >>> Time takes in returning result of query on row key:
>> >>> >>> With Storm running and very less RAM available: 50 sec
>> >>> >>>
>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
>> >>> >>>
>> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM): 8
>> sec
>> >>> >>>
>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB
>> >>> >>> RAM):
>> >>> >>> 0.0150 seconds. :)
>> >>> >>>
>> >>> >>> So, the difference seems to be many fold of what native HBase is
>> >>> >>> providing to us. I am not able to understand how it can be
>> possible?
>> >>> >>> What I
>> >>> >>> am missing here?
>> >>> >>>



Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
Yes, that is why this is troubling me. However, the HBase shell runs on the
same machine in the same environment, so if this were a resource issue (CPU
or memory) it should have affected HBase too, yet HBase gives me results
within 0.0150 seconds. :(

No, I haven't tested it outside AWS. I doubt AWS itself is the cause, given
how much better the native HBase query performs in the HBase shell.
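
To put the gap in numbers, here is a small sketch that divides the Phoenix
timings reported at the start of this thread by the 0.0150-second HBase shell
figure. The labels and the helper name `slowdown` are only illustrative; the
timings are the ones quoted in the thread. The result is a slowdown of several
hundred to a few thousand times, which is far larger than a SQL layer alone
would be expected to add.

```python
# Timings reported earlier in this thread, in seconds.
HBASE_SHELL_GET = 0.0150

phoenix_timings = {
    "r3.large, Storm running": 50.0,
    "r3.large, Storm stopped": 18.0,
    "r3.xlarge (4 CPU, 30 GB RAM)": 8.0,
}


def slowdown(phoenix_seconds, hbase_seconds=HBASE_SHELL_GET):
    """Return how many times slower the Phoenix query was than the HBase get."""
    return phoenix_seconds / hbase_seconds


# Slowdown factor for each reported configuration.
ratios = {label: slowdown(t) for label, t in phoenix_timings.items()}

for label, ratio in ratios.items():
    print(f"{label}: about {ratio:,.0f}x slower than the HBase shell get")
```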


On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <ja...@apache.org>
wrote:

> Something is up in your environment. What version of Phoenix and HBase
> are you using and in what environment? Have you tried this locally,
> outside of AWS to compare?
>
> Take a look at our perf numbers, generated more-or-less daily, and
> which run over more data that what you're testing against:
> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>
> Some of these are point queries and they take in the neighborhood of
> 0.01 seconds.
>
> Thanks,
> James
>
> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vi...@infoobjects.com>
> wrote:
> > Missed to mention that count query (posted in my last mail) is also
> taking
> > very long time to return the count.
> >
> >
> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vi...@infoobjects.com>
> > wrote:
> >>
> >> As I mentioned, schema is nothing but bunch of fields (some being
> >> integers, longs and text) along with primary key (row key) and I am
> making
> >> simple query to get result for a particular primary key, nothing more
> than
> >> that.
> >>
> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
> >>
> >> +------------+
> >>
> >> |  COUNT(1)  |
> >>
> >> +------------+
> >>
> >> | 4667515    |
> >>
> >> +------------+
> >>
> >> 1 row selected (132.11 seconds)
> >>
> >>
> >>
> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
> >> <pu...@pubmatic.com> wrote:
> >>>
> >>> If you can share the schema,data type,cardinality of each dimension and
> >>> usual queries, I can help to design a schema with performance of less
> than 1
> >>> sec using Phoenix.
> >>>
> >>>
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> ------ Original message------
> >>>
> >>> From: James Taylor
> >>>
> >>> Date: Sat, Sep 6, 2014 10:15 AM
> >>>
> >>> To: user;
> >>>
> >>> Subject:Re: Phoenix response time
> >>>
> >>>
> >>>
> >>> Vikas,
> >>> Please post your schema and query.
> >>> Thanks,
> >>> James
> >>>
> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vi...@infoobjects.com>
> >>> wrote:
> >>> > Ours is also a single node setup right now and as of now there are
> less
> >>> > than
> >>> > 1 million rows which is expected to grow around 100m at minimum.
> >>> >
> >>> > I am aware of secondary indexes but when I am querying on primary/row
> >>> > key,
> >>> > why would it take so much time?
> >>> >
> >>> > I am directly querying using sqlline for Phoenix and hbase shell for
> >>> > HBase
> >>> > query. I am not expecting to do any fine tuning for such small
> dataset.
> >>> > I am
> >>> > assumimg a minimum performance level out of the box.
> >>> >
> >>> > On Friday, September 5, 2014, yeshwanth kumar <yeshwanth43@gmail.com
> >
> >>> > wrote:
> >>> >>
> >>> >> hi vikas,
> >>> >>
> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
> >>> >> used HDP 2.1
> >>> >> our table has 50-70M rows,
> >>> >> select on that table took less than 2 seconds.
> >>> >> Aggregation queries took less than 8 seconds.
> >>> >> for achieving good performance we created secondary index on the
> >>> >> table.
> >>> >>
> >>> >> make sure you finetuned hbase,
> >>> >> enabling compression on the data makes a difference in response.
> >>> >> if u distribute the data and load over all regions in hbase,
> >>> >> look at the performance tips mentioned in phoenix blog
> >>> >>
> >>> >> -yeshwanth
> >>> >>
> >>> >>
> >>> >>
> >>> >> Cheers,
> >>> >> Yeshwanth
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <
> vikas@infoobjects.com>
> >>> >> wrote:
> >>> >>>
> >>> >>> Hi,
> >>> >>>
> >>> >>> Preface: We are testing phoenix using Hortonworks distribution for
> >>> >>> HBase
> >>> >>> on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
> >>> >>>
> >>> >>> With contrast to performance benchmarks, I found Phoenix to be very
> >>> >>> slow
> >>> >>> in querying even on primary key or row key. So, tried to increase
> the
> >>> >>> RAM
> >>> >>> for HBase and Phoenix and increasing the CPU and RAM by upgrading
> the
> >>> >>> EC2
> >>> >>> machine type to r3.xlarge (4 CPU, 30 GB RAM). Results were like
> this:
> >>> >>>
> >>> >>> Time takes in returning result of query on row key:
> >>> >>> With Storm running and very less RAM available: 50 sec
> >>> >>>
> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
> >>> >>>
> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM): 8
> sec
> >>> >>>
> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB
> >>> >>> RAM):
> >>> >>> 0.0150 seconds. :)
> >>> >>>
> >>> >>> So, the difference seems to be many fold of what native HBase is
> >>> >>> providing to us. I am not able to understand how it can be
> possible?
> >>> >>> What I
> >>> >>> am missing here?
> >>> >>>




Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
Something is up in your environment. What version of Phoenix and HBase
are you using and in what environment? Have you tried this locally,
outside of AWS to compare?

Take a look at our perf numbers, generated more-or-less daily, and
which run over more data than what you're testing against:
http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm

Some of these are point queries and they take in the neighborhood of
0.01 seconds.
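
For a like-for-like comparison between a local run and the AWS run, something
along these lines may help. It is only a sketch: `run_query` is a hypothetical
zero-argument callable that executes the point query on a connection that is
already open and fetches all rows, so connection setup is not counted. The
first few runs are discarded as warm-up, since (as noted elsewhere in this
thread) what is already in the HBase block cache affects the numbers.

```python
import statistics
import time


def time_query(run_query, warmup_runs=3, timed_runs=10):
    """Time run_query() on an already-open connection.

    run_query is a hypothetical zero-argument callable that executes the
    query and fetches all rows. Warm-up runs are discarded so that the
    timed runs are more likely to be served from a warm cache.
    """
    for _ in range(warmup_runs):
        run_query()

    samples = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)

    # Report the spread, not just one number, so outliers are visible.
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in for a real query so the sketch runs without a cluster.
    def fake_query():
        time.sleep(0.001)

    print(time_query(fake_query))
```

Running the same harness in both environments and comparing the medians would
show whether the slowdown follows the environment or the query itself.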

Thanks,
James


Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
I forgot to mention that the count query (posted in my last mail) is also
taking a very long time to return the count.



Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
As I mentioned, the schema is nothing but a bunch of fields (some being
integers, longs and text) along with a primary key (row key), and I am making
a simple query to get the result for a particular primary key, nothing more
than that.

0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
+------------+
|  COUNT(1)  |
+------------+
| 4667515    |
+------------+
1 row selected (*132.11 seconds*)
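
To put the figures reported in this thread side by side, the arithmetic can be worked through as below. This is only a back-of-the-envelope sketch: the numbers are copied from the messages in this thread, the variable names are made up for the example, and the sqlline and hbase shell timings may not measure exactly the same work.

```python
# Figures quoted in this thread (sqlline and hbase shell timings).
COUNT_ROWS = 4_667_515
COUNT_SECONDS = 132.11
HBASE_GET_SECONDS = 0.0150
PHOENIX_POINT_SECONDS = {
    "Storm running, little free RAM": 50.0,
    "Storm stopped": 18.0,
    "r3.xlarge (4 CPU, 30 GB RAM)": 8.0,
}

# Rows processed per second by the COUNT(1) query.
rows_per_second = COUNT_ROWS / COUNT_SECONDS
print(f"COUNT(1) scan rate: {rows_per_second:,.0f} rows/s")

# How many times slower each Phoenix row-key query was than the hbase shell get.
slowdowns = {
    label: secs / HBASE_GET_SECONDS
    for label, secs in PHOENIX_POINT_SECONDS.items()
}
for label, factor in slowdowns.items():
    print(f"{label}: about {factor:,.0f}x the hbase shell get")
```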



Re: Phoenix response time

Posted by Puneet Kumar Ojha <pu...@pubmatic.com>.
If you can share the schema, the data type and cardinality of each dimension, and your usual queries, I can help design a schema that performs in under 1 second using Phoenix.



Thanks






Re: Phoenix response time

Posted by James Taylor <ja...@apache.org>.
Vikas,
Please post your schema and query.
Thanks,
James


Re: Phoenix response time

Posted by Vikas Agarwal <vi...@infoobjects.com>.
Ours is also a single-node setup right now, and as of now there are fewer
than 1 million rows, which is expected to grow to around 100M at minimum.

I am aware of secondary indexes, but when I am querying on the primary/row
key, why would it take so much time?

I am querying directly, using sqlline for Phoenix and the hbase shell for the
HBase query. I do not expect to have to do any fine-tuning for such a small
dataset. I am assuming a minimum performance level out of the box.


Re: Phoenix response time

Posted by yeshwanth kumar <ye...@gmail.com>.
hi vikas,

we used phoenix on a 4 core/23Gb machine, as a single node setup.
used HDP 2.1
our table has 50-70M rows,
select on that table took less than 2 seconds.
Aggregation queries took less than 8 seconds.
for achieving good performance we created secondary index on the table.

make sure you have fine-tuned hbase.
enabling compression on the data makes a difference in response,
as does distributing the data and load over all regions in hbase.
also look at the performance tips mentioned in the phoenix blog

-yeshwanth



Cheers,
Yeshwanth
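
One common way to spread data and load over regions is to prefix each row key with a small hash-derived "salt" byte, which is the idea behind salted tables in Phoenix (the SALT_BUCKETS table option). The sketch below only illustrates the idea: the `salted_key` helper, the use of CRC32 as the hash, the bucket count of 4 and the sample keys are all assumptions for the example, not Phoenix's actual implementation.

```python
import zlib
from collections import Counter


def salted_key(row_key: bytes, buckets: int) -> bytes:
    """Prefix row_key with one salt byte in the range [0, buckets)."""
    # CRC32 is a stand-in hash chosen for this sketch only.
    salt = zlib.crc32(row_key) % buckets
    return bytes([salt]) + row_key


# Sequential keys would otherwise land next to each other in one key range.
keys = [f"row{i:08d}".encode() for i in range(10_000)]

# Count how many keys fall into each salt bucket.
distribution = Counter(salted_key(k, 4)[0] for k in keys)
print(sorted(distribution.items()))
```

The trade-off is that a range scan over the original key order then has to visit every bucket, so salting helps write and load distribution more than it helps ordered scans.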


