Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2015/10/05 17:45:26 UTC
[jira] [Updated] (TRAFODION-648) LP Bug: 1371670 - Use of bulk load for ustat causes a core in some cases

     [ https://issues.apache.org/jira/browse/TRAFODION-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Subbiah updated TRAFODION-648:
-------------------------------------
    Assignee: David Wayne Birdsall  (was: Barry Fritchman)

> LP Bug: 1371670 - Use of bulk load for ustat causes a core in some cases
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-648
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-648
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Apache Trafodion
>            Assignee: David Wayne Birdsall
>             Fix For: 2.0-incubating
>
>
> In some cases we noticed that using bulk load with update statistics makes the compiler generate a core file. The update statistics operation continues and the sample table is populated.
> The issue was seen on zircon2 and on Amethyst with 2 different tables. On Zircon4 the issue did not happen when I ran the same update statistics statement that produced the core on zircon2.
> The stack is below. My initial debugging showed that the issue happens when we try to do cleanup in NATable.cpp:
>   for(i=0; i < tablesToDeleteAfterStatement_.entries(); i++)
>   {
>     if ( tablesToDeleteAfterStatement_[i]->getHeapType() == NATable::OTHER ) {
>       tableHeap = tablesToDeleteAfterStatement_[i]->heap_;
>       delete tableHeap;
>     }
>   } 
> In the case I debugged with Barry, it looks like deleting the 3rd item in the list fails because it was already deleted. The 1st and 3rd elements appear to point to the same object, so once we delete the first one, the 3rd element points to an object that no longer exists.
> [Thread debugging using libthread_db enabled]
> Core was generated by `tdm_arkcmp SQMON1.0 00000 00000 011902 $Z0009Q2 tag#0$port#52331$description#n0'.
> Program terminated with signal 6, Aborted.
> #0  0x00007fffee9a38a5 in raise () from /lib64/libc.so.6
> #0  0x00007fffee9a38a5 in raise () from /lib64/libc.so.6
> #1  0x00007fffee9a500d in abort () from /lib64/libc.so.6
> #2  0x00007ffff138f455 in os::abort(bool) () from /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #3  0x00007ffff14ef717 in VMError::report_and_die() () from /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #4  0x00007ffff1392f60 in JVM_handle_linux_signal () from /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #5  <signal handler called>
> #6  NATableDB::resetAfterStatement (this=0x7fffe79683b0) at ../optimizer/NATable.cpp:7559
> #7  0x00007ffff4f712df in SchemaDB::cleanupPerStatement (this=0x7fffe79683a0) at ../optimizer/SchemaDB.cpp:186
> #8  0x00007ffff4127735 in CmpContext::cleanup (this=0x7fffe7963090, exception=<value optimized out>) at ../arkcmp/CmpContext.cpp:489
> #9  0x00007ffff4129f63 in CmpContext::unsetStatement (this=0x7fffe7963090, s=0x7fffe7990c10, exceptionRaised=0) at ../arkcmp/CmpContext.cpp:453
> #10 0x00007ffff4134e46 in CmpStatement::~CmpStatement (this=0x7fffe7990c10, __in_chrg=<value optimized out>) at ../arkcmp/CmpStatement.cpp:224
> #11 0x00007ffff4134f11 in CmpStatement::~CmpStatement (this=0x7fffe7990c10, __in_chrg=<value optimized out>) at ../arkcmp/CmpStatement.cpp:227
> #12 0x00007ffff41251b9 in ExCmpMessage::actOnReceive (this=0x7fffffffc250) at ../arkcmp/CmpConnection.cpp:588
> #13 0x00007ffff6fdca56 in IpcMessageStream::internalActOnReceive (this=0x7fffffffc250, buffer=<value optimized out>, connection=0xbaadb0) at ../common/Ipc.cpp:3553
> #14 0x00007ffff6ff3aab in GuaConnectionToClient::acceptBuffer (this=0xbaadb0, buffer=<value optimized out>, receivedDataLength=<value optimized out>) at ../common/IpcGuardian.cpp:2467
> #15 0x00007ffff6ff47af in GuaReceiveControlConnection::wait (this=0xb9a5e0, timeout=-1, eventConsumed=<value optimized out>, ipcAwaitiox=0x7fffffffbc00) at ../common/IpcGuardian.cpp:3164
> #16 0x00007ffff6ff5b92 in GuaConnectionToClient::wait (this=0xbaadb0, timeout=<value optimized out>, eventConsumed=0x0, ipcAwaitiox=0x0) at ../common/IpcGuardian.cpp:2136
> #17 0x00007ffff6fe91aa in IpcSetOfConnections::waitOnSet (this=0x7fffffffc3f0, timeout=-1, calledByESP=0, timedout=0x0) at ../common/Ipc.cpp:1709
> #18 0x00007ffff6fe9ced in IpcMessageStream::waitOnMsgStream (this=0x7fffffffc250, timeout=-1) at ../common/Ipc.cpp:3272
> #19 0x00007ffff6fea032 in IpcMessageStream::receive (this=0x7fffffffc250, waited=1) at ../common/Ipc.cpp:3254
> #20 0x00000000004048ae in main (argc=2, argv=0x7fffffffc9c8) at ../bin/arkcmp.cpp:303
> To reproduce, you can use zircon2 and the statement
> update statistics for table trafodion.bench60.ycsb_table_20 on every key generate 1 intervals sample 1000 rows
> or use amethyst 5 and run update statistics on the ossdba.box table.
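
The double delete described in the report (two list entries sharing one heap) could be avoided along the lines of the sketch below. This is only an illustration under stated assumptions, not the fix applied in Trafodion: Heap, Table, HeapType and deleteHeapsOnce are hypothetical stand-ins for NATable, its heap_ member and the loop in NATableDB::resetAfterStatement. The idea is to collect the distinct heap pointers first and delete each of them exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the per-table heap; counts live instances
// so that a leak or a missed delete is easy to observe.
struct Heap {
  static int liveCount;
  Heap() { ++liveCount; }
  ~Heap() { --liveCount; }
};
int Heap::liveCount = 0;

// Hypothetical heap categories; only OTHER heaps are owned by the list.
enum class HeapType { STATEMENT, CONTEXT, OTHER };

// Hypothetical stand-in for an entry of tablesToDeleteAfterStatement_.
struct Table {
  HeapType heapType;
  Heap *heap;
};

// Deletes every distinct OTHER heap exactly once, even when several
// entries share the same heap. Returns the number of heaps deleted.
std::size_t deleteHeapsOnce(std::vector<Table> &tables) {
  std::unordered_set<Heap *> distinct;
  for (Table &t : tables) {
    if (t.heapType == HeapType::OTHER && t.heap != nullptr) {
      distinct.insert(t.heap);  // duplicates collapse into one entry
      t.heap = nullptr;         // no entry keeps a dangling pointer
    }
  }
  for (Heap *h : distinct) {
    delete h;
  }
  return distinct.size();
}
```

With three entries where the 1st and 3rd share a heap, as in the case debugged above, this sketch deletes two heaps and leaves all three entries with a null heap pointer. Whether shared heaps should appear in the list at all is a separate question that the sketch does not address.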



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)