You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by Kristoffer Sjögren <st...@gmail.com> on 2014/03/28 16:46:11 UTC

Slow delete

Hi

I was doing some data migration today on phoenix and noticed that delete
... where statements are _really_ slow (around 1 second per row).

Is this a known issue? Our installation works fine for inserts and selects.

Cheers,
-Kristoffer

Re: Slow delete

Posted by James Taylor <ja...@apache.org>.
Use the row value constructor syntax for this:

$ delete from t where (a,b) in ((1,1), (1,2),(1,3));

Thanks,
James


On Tue, Apr 1, 2014 at 12:13 AM, Kristoffer Sjögren <st...@gmail.com>wrote:

> Sorry for late reply.
>
> Here's the table
>
> $ create table t (a INTEGER not null, b INTEGER not null constraint pk
> primary key(a,b));
> $ upsert into t values (1,1);
>
> I did single delete statement / transaction for a composite primary key.
> Also tried to batch multiple delete statements / transaction.
>
> $ delete from t where a = 1 and b = 1;
>
> Can you show me an example of a IN ... WHERE DELETE statement? The
> following does not seem to work?
>
> jdbc:phoenix:localhost> delete from t where a in (select a from t);
> Error: com.salesforce.phoenix.parse.InParseNode[A,
> com.salesforce.phoenix.parse.SubqueryParseNode@4b96fac5] (state=,code=0)
>
> jdbc:phoenix:localhost> delete from t where (a,b) in (select a,b from t);
> Error:
> com.salesforce.phoenix.parse.InParseNode[com.salesforce.phoenix.parse.RowValueConstructorParseNode[A,
> B], com.salesforce.phoenix.parse.SubqueryParseNode@51436be2]
> (state=,code=0)
>
>
> The following statement works, but the row is never deleted? Bug?
>
> $ delete from t where (a,b) in (1,1);
>
> The following statement does in fact work and also delete the row.
>
> $ delete from t where (a) in (1);
>
>
>
>
>
>
>
> On Fri, Mar 28, 2014 at 6:19 PM, James Taylor <ja...@apache.org>wrote:
>
>> Hi Kristoffer,
>> No, this isn't a known or normal issue. What version of Phoenix and HBase
>> are you using? What does your DELETE statement look like? Perf-wise, it's
>> best if you can:
>> * Batch deletes (any mutation, really) to cut down on RPC traffic. The
>> phoenix.mutate.batchSize config parameter controls that and defaults to
>> 1000.
>> * Run as few DELETE statements as possible. For example, if you're
>> deleting by PK, then form a single DELETE statement with an IN clause in
>> the WHERE clause.
>> * Set auto commit on for your connection: conn.setAutoCommit(true)
>> assuming you don't have to issue a DELETE statement for each row separately.
>> * If you do have to issue separate DELETE statement for each row, then
>> keep auto commit as false and instead collect up batches of 1000 before
>> issuing a commit.
>>
>> If you're doing all that, and it's still slow, check your logs for
>> exceptions and perhaps try issuing a delete from the HBase shell against
>> your table as a test.
>>
>> Thanks,
>> James
>>
>>
>> On Fri, Mar 28, 2014 at 8:46 AM, Kristoffer Sjögren <st...@gmail.com>wrote:
>>
>>> Hi
>>>
>>> I was doing some data migration today on phoenix and noticed that delete
>>> ... where statements are _really_ slow (around 1 second per row).
>>>
>>> Is this a known issue? Our installation works fine for inserts and
>>> selects.
>>>
>>> Cheers,
>>> -Kristoffer
>>>
>>
>>
>

Re: Slow delete

Posted by Kristoffer Sjögren <st...@gmail.com>.
Sorry for late reply.

Here's the table

$ create table t (a INTEGER not null, b INTEGER not null constraint pk
primary key(a,b));
$ upsert into t values (1,1);

I did single delete statement / transaction for a composite primary key.
Also tried to batch multiple delete statements / transaction.

$ delete from t where a = 1 and b = 1;

Can you show me an example of a IN ... WHERE DELETE statement? The
following does not seem to work?

jdbc:phoenix:localhost> delete from t where a in (select a from t);
Error: com.salesforce.phoenix.parse.InParseNode[A,
com.salesforce.phoenix.parse.SubqueryParseNode@4b96fac5] (state=,code=0)

jdbc:phoenix:localhost> delete from t where (a,b) in (select a,b from t);
Error:
com.salesforce.phoenix.parse.InParseNode[com.salesforce.phoenix.parse.RowValueConstructorParseNode[A,
B], com.salesforce.phoenix.parse.SubqueryParseNode@51436be2] (state=,code=0)


The following statement works, but the row is never deleted? Bug?

$ delete from t where (a,b) in (1,1);

The following statement does in fact work and also delete the row.

$ delete from t where (a) in (1);







On Fri, Mar 28, 2014 at 6:19 PM, James Taylor <ja...@apache.org>wrote:

> Hi Kristoffer,
> No, this isn't a known or normal issue. What version of Phoenix and HBase
> are you using? What does your DELETE statement look like? Perf-wise, it's
> best if you can:
> * Batch deletes (any mutation, really) to cut down on RPC traffic. The
> phoenix.mutate.batchSize config parameter controls that and defaults to
> 1000.
> * Run as few DELETE statements as possible. For example, if you're
> deleting by PK, then form a single DELETE statement with an IN clause in
> the WHERE clause.
> * Set auto commit on for your connection: conn.setAutoCommit(true)
> assuming you don't have to issue a DELETE statement for each row separately.
> * If you do have to issue separate DELETE statement for each row, then
> keep auto commit as false and instead collect up batches of 1000 before
> issuing a commit.
>
> If you're doing all that, and it's still slow, check your logs for
> exceptions and perhaps try issuing a delete from the HBase shell against
> your table as a test.
>
> Thanks,
> James
>
>
> On Fri, Mar 28, 2014 at 8:46 AM, Kristoffer Sjögren <st...@gmail.com>wrote:
>
>> Hi
>>
>> I was doing some data migration today on phoenix and noticed that delete
>> ... where statements are _really_ slow (around 1 second per row).
>>
>> Is this a known issue? Our installation works fine for inserts and
>> selects.
>>
>> Cheers,
>> -Kristoffer
>>
>
>

Re: Slow delete

Posted by James Taylor <ja...@apache.org>.
Hi Kristoffer,
No, this isn't a known or normal issue. What version of Phoenix and HBase
are you using? What does your DELETE statement look like? Perf-wise, it's
best if you can:
* Batch deletes (any mutation, really) to cut down on RPC traffic. The
phoenix.mutate.batchSize config parameter controls that and defaults to
1000.
* Run as few DELETE statements as possible. For example, if you're deleting
by PK, then form a single DELETE statement with an IN clause in the WHERE
clause.
* Set auto commit on for your connection: conn.setAutoCommit(true) assuming
you don't have to issue a DELETE statement for each row separately.
* If you do have to issue separate DELETE statement for each row, then keep
auto commit as false and instead collect up batches of 1000 before issuing
a commit.

If you're doing all that, and it's still slow, check your logs for
exceptions and perhaps try issuing a delete from the HBase shell against
your table as a test.

Thanks,
James


On Fri, Mar 28, 2014 at 8:46 AM, Kristoffer Sjögren <st...@gmail.com>wrote:

> Hi
>
> I was doing some data migration today on phoenix and noticed that delete
> ... where statements are _really_ slow (around 1 second per row).
>
> Is this a known issue? Our installation works fine for inserts and selects.
>
> Cheers,
> -Kristoffer
>