Posted to dev@trafodion.apache.org by Dave Birdsall <da...@esgyn.com> on 2015/11/17 20:29:33 UTC

Is anyone else seeing this?

Hi,



My development instance has been having problems dropping tables.



After some debugging, I’ve been able to reproduce the failure with just a
simple DELETE on a user table. See the log below.



When I do a naked DELETE, it succeeds. I notice that transID=0 is passed to
HBaseClient_JNI::checkAndDeleteRow.



But when I do a BEGIN WORK, then DELETE, it fails with the same sort of
error I was seeing on DROP TABLE. I notice in that case I have a non-zero
transID passed to HBaseClient_JNI::checkAndDeleteRow. I have noticed that
when I do DROP TABLE, I always have a non-zero transID here. Hence my
failures.
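
Condensed, the session in the log below comes down to the following
statements. This is only a sketch distilled from that log: the definition of
t4 is not shown here, and the comments restate what gdb reported at the
breakpoint, not why the transactional path fails.

```sql
insert into t4 values (17,18),(1,0);

-- No explicit transaction: checkAndDeleteRow sees transID=0, delete succeeds.
delete from t4 where a = 1;

-- Explicit transaction: checkAndDeleteRow sees a non-zero transID
-- (80016 in this run) and the delete fails with ERROR[8448].
begin work;
delete from t4 where a = 17;
rollback work;
```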



Is anyone else seeing this? I’m on a fairly recent Trafodion 2.0 baseline.



Dave





>>insert into t4 values (17,18),(1,0);



--- 2 row(s) inserted.

>>!delete;

>>delete from t4 where a = 1;



Breakpoint 3, HBaseClient_JNI::checkAndDeleteRow (this=0x7fffe81ba0c8,
    heap=0x7fffe81e36f8, tableName=0x7fffe81e6cd0 "TRAFODION.SEABASE.T4",
    hbs=0x101881f8, useTRex=true, transID=0, rowID=..., columnToCheck=...,
    columnValToCheck=..., timestamp=-1, asyncOperation=false,
    outHtc=0x7ffffffeffe8) at ../executor/HBaseClient_JNI.cpp:3364
3364    QRLogger::log(CAT_SQL_HBASE, LL_DEBUG,
"HBaseClient_JNI::checkAndDeleteRow(%ld, %s) called.", transID, rowID.val);

(gdb) c

Continuing.



--- 1 row(s) deleted.

>>begin work;



--- SQL operation complete.

>>delete from t4 where a = 17;



Breakpoint 3, HBaseClient_JNI::checkAndDeleteRow (this=0x7fffe81ba0c8,
    heap=0x7fffe81e36f8, tableName=0x7fffe81e6cd0 "TRAFODION.SEABASE.T4",
    hbs=0x10187e18, useTRex=true, transID=80016, rowID=..., columnToCheck=...,
    columnValToCheck=..., timestamp=-1, asyncOperation=false,
    outHtc=0x7fffffff0048) at ../executor/HBaseClient_JNI.cpp:3364
3364    QRLogger::log(CAT_SQL_HBASE, LL_DEBUG,
"HBaseClient_JNI::checkAndDeleteRow(%ld, %s) called.", transID, rowID.val);

(gdb) c

Continuing.



*** ERROR[8448] Unable to access Hbase interface. Call to
ExpHbaseInterface::checkAndDeleteRow returned error
HBASE_ACCESS_ERROR(-706). Cause:
java.io.IOException: Coprocessor result is null, retries exhausted
org.apache.hadoop.hbase.client.transactional.TransactionalTable.checkAndDelete(TransactionalTable.java:486)
org.apache.hadoop.hbase.client.transactional.RMInterface.checkAndDelete(RMInterface.java:402)
org.trafodion.sql.HTableClient.checkAndDeleteRow(HTableClient.java:939)
org.trafodion.sql.HBaseClient.checkAndDeleteRow(HBaseClient.java:1554)
.



--- 0 row(s) deleted.

>>rollback work;



--- SQL operation complete.

>>

RE: Is anyone else seeing this?

Posted by Dave Birdsall <da...@esgyn.com>.
Hi,



Addendum: I found that if I stopped HBase (without stopping Trafodion) and
restarted it, the failures went away.



Dave




RE: Is anyone else seeing this?

Posted by Dave Birdsall <da...@esgyn.com>.
Hi Selva,

Thanks for responding. No, I've been running purely in debug mode the whole
time.

The one unusual thing I did was use OSIM to debug a query. After unloading
OSIM, the problems seemed to start. I even went so far as to blow away all
"TRAFODION*" files at the HBase level and then did a fresh "INITIALIZE
TRAFODION", but the problem persisted until I stopped and restarted HBase.

Dave

-----Original Message-----
From: Selva Govindarajan [mailto:selva.govindarajan@esgyn.com]
Sent: Tuesday, November 17, 2015 11:38 AM
To: dev@trafodion.incubator.apache.org
Subject: RE: Is anyone else seeing this?

Hi Dave,

I have run the full regressions with Trafodion 2.0 with no issues. By any
chance, have you mixed the debug and release environments, meaning you started
HBase in release mode and then switched to debug mode to start the Trafodion
instance, or vice versa? I have encountered this kind of failure when
switching modes before, and solved it by restarting HBase in the same mode as
the Trafodion instance.

Selva


RE: Is anyone else seeing this?

Posted by Selva Govindarajan <se...@esgyn.com>.
Hi Dave,

I have run the full regressions with Trafodion 2.0 with no issues. By any
chance, have you mixed the debug and release environments, meaning you started
HBase in release mode and then switched to debug mode to start the Trafodion
instance, or vice versa? I have encountered this kind of failure when
switching modes before, and solved it by restarting HBase in the same mode as
the Trafodion instance.

Selva
