Posted to user@phoenix.apache.org by Sumanta Gh <su...@tcs.com> on 2015/10/13 09:00:42 UTC

Drop in throughput

Hi,
I have been investigating why throughput drops significantly when I run a mixed workload of 70% upserts and 30% selects, compared with 100% upserts or 100% selects.

While doing so, I found that the following line is printed in my log every time:

"Re-resolved stale table MY_TABLE with seqNum 0 at timestamp 1443618964162 with 22 columns: [...]"

Can this be avoided for each SQL query?

regards,
Sumanta

Re: Drop in throughput

Posted by James Taylor <ja...@apache.org>.

Sumanta,
It's hard to say what is causing the drop in throughput without more
information. It's not uncommon for a mixed workload to have some impact
on throughput, though. For example, HBase will be doing splits and
compactions, which will impact scan performance. Also, you'll have multiple
HFiles that need to be merged at read time, as opposed to having a
single HFile. Have you profiled the system to see if anything stands out?

As for the line you're seeing in your logs, there's another JIRA
(PHOENIX-1270), which I plan to implement, that will avoid pinging the
server to check whether a schema is up to date. The trade-off will
be that schema changes to a table will not be seen until a cluster bounce.

Thanks,
James

On Tue, Oct 13, 2015 at 11:28 AM, Thomas D'Silva <td...@salesforce.com>
wrote:

> Sumanta,
>
> Phoenix resolves the table for every SELECT. For UPSERT it resolves
> the table once at commit time.
> We have a JIRA in the txn branch where if you specify an SCN it will
> cache the table and look it up from the cache
> https://issues.apache.org/jira/browse/PHOENIX-1812
> This will be available once we merge the txn branch back.
>
> Thanks,
> Thomas

Re: Drop in throughput

Posted by Thomas D'Silva <td...@salesforce.com>.

Sumanta,

Phoenix resolves the table for every SELECT. For UPSERT, it resolves
the table once at commit time.
We have a JIRA in the txn branch where, if you specify an SCN, Phoenix
will cache the table and look it up from the cache:
https://issues.apache.org/jira/browse/PHOENIX-1812
This will be available once we merge the txn branch back.
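
For readers who have not specified an SCN before, here is a minimal
sketch of how one might be set on a Phoenix JDBC connection. It assumes
the "CurrentSCN" connection property (the value behind
PhoenixRuntime.CURRENT_SCN_ATTRIB). The class name, the helper methods,
and the ZooKeeper quorum argument are hypothetical and only for
illustration. Whether the table is then served from the client cache
depends on running a build that includes PHOENIX-1812.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper class; not part of Phoenix.
public class ScnConnectionSketch {
    // Connection property that Phoenix reads the SCN from
    // (same string as PhoenixRuntime.CURRENT_SCN_ATTRIB).
    static final String CURRENT_SCN = "CurrentSCN";

    // Build connection properties that pin the connection to a fixed
    // timestamp, given in milliseconds since the epoch.
    static Properties scnProperties(long scnMillis) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, Long.toString(scnMillis));
        return props;
    }

    // Open a Phoenix connection at the given SCN. zkQuorum is the
    // ZooKeeper quorum of the HBase cluster (assumed, e.g. "localhost").
    // Requires the Phoenix client jar on the classpath.
    static Connection open(String zkQuorum, long scnMillis) throws SQLException {
        return DriverManager.getConnection("jdbc:phoenix:" + zkQuorum,
                                           scnProperties(scnMillis));
    }

    public static void main(String[] args) throws SQLException {
        long scn = System.currentTimeMillis();
        if (args.length == 0) {
            // No cluster given: just show the properties that would be sent.
            System.out.println(scnProperties(scn));
            return;
        }
        try (Connection conn = open(args[0], scn)) {
            System.out.println("Connected at SCN " + scn);
        }
    }
}
```

Note that a connection opened this way reads data and table metadata as
of the given timestamp, so it is probably best kept to the SELECT side
of a mixed workload, with upserts issued on a separate connection.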

Thanks,
Thomas
