Posted to issues@trafodion.apache.org by "Hans Zeller (JIRA)" <ji...@apache.org> on 2016/03/25 03:33:25 UTC

[jira] [Reopened] (TRAFODION-1023) LP Bug: 1425661 - Hang with hive scan and [FIRST N]

     [ https://issues.apache.org/jira/browse/TRAFODION-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hans Zeller reopened TRAFODION-1023:
------------------------------------
      Assignee: Hans Zeller  (was: Apache Trafodion)

Anu and Rao ran into this issue again this week and I am working on a fix, so I think this issue is not yet resolved.

> LP Bug: 1425661 - Hang with hive scan and [FIRST N]
> ---------------------------------------------------
>
>                 Key: TRAFODION-1023
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1023
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>            Reporter: Apache Trafodion
>            Assignee: Hans Zeller
>            Priority: Critical
>
> A SELECT statement using [FIRST n] can fail to clean up and hang when the main thread deallocates the statement. A reader thread can be observed waiting on a new buffer:
> #0  0x0000003d60c0b43c in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib64/libpthread.so.0
> #1  0x00007ffff30b797d in ExLobLock::wait (this=0x217fef8)
>     at ../exp/ExpLOBaccess.cpp:2246
> #2  0x00007ffff30bd7c7 in ExLobGlobals::performRequest (this=0x217f7e0,
>     request=<optimized out>) at ../exp/ExpLOBaccess.cpp:1859
> #3  0x00007ffff30bd8b9 in ExLobGlobals::doWorkInThread (this=0x217f7e0)
>     at ../exp/ExpLOBaccess.cpp:2397
> #4  0x00007ffff30bd909 in workerThreadMain (arg=<optimized out>)
>     at ../exp/ExpLOBaccess.cpp:2048
> #5  0x0000003d60c07851 in start_thread () from /lib64/libpthread.so.0
> #6  0x0000003d608e890d in clone () from /lib64/libc.so.6
> (gdb) fr 2
> #2  0x00007ffff30bd7c7 in ExLobGlobals::performRequest (this=0x217f7e0,
>     request=<optimized out>) at ../exp/ExpLOBaccess.cpp:1859
> 1859                cursor->lock_.wait();
> (gdb) list
> 1854              // there are no empty buffers.
> 1855              // if prefetch list already has the max, wait for one to freeup.
> 1856              totalBufSize =  cursor->prefetchBufList_.size() * cursor->bufMaxSize_;
> 1857              if (totalBufSize > LOB_CURSOR_PREFETCH_BYTES_MAX) {
> 1858                traceMessage("wait on condition cursor",__LINE__);
> 1859                cursor->lock_.wait();
> 1860                continue;
> 1861              }
> The main thread's backtrace in the hang:
> #0  0x0000003d60c080ad in pthread_join () from /lib64/libpthread.so.0
> #1  0x00007ffff30b9d2c in ExLobGlobals::~ExLobGlobals (this=0x217f7e0,
>     __in_chrg=<optimized out>) at ../exp/ExpLOBaccess.cpp:1973
> #2  0x00007ffff30be711 in ExLobsOper (lobName=0x7fffffff1780 "/h/temp",
>     handleIn=0x0, handleInLen=0, hdfsServer=0x0, hdfsPort=0, handleOut=0x0,
>     handleOutLen=@0x7fffffff17f0: 0, descNumIn=0,
>     descNumOut=@0x7fffffff17f0: 0, retOperLen=@0x7fffffff17f0: 0,
>     requestTagIn=0, requestTagOut=@0x7fffffff17f0: 0,
>     requestStatus=@0x7fffffff17fc: 32767, cliError=@0x7fffffff17e8: -1,
>     dir=0x7fffffff1780 "/h/temp", storage=Lob_HDFS_File, source=0x0,
>     sourceLen=0, cursorBytes=0, cursorId=0x0, operation=Lob_Cleanup,
>     subOperation=Lob_None, waited=1, globPtr=@0x7fffd6fd6ab8: 0x217f7e0,
>     transId=0, blackBox=0x0, blackBoxLen=0, bufferSize=0, replication=0,
>     blockSize=0) at ../exp/ExpLOBaccess.cpp:2732
> #3  0x00007ffff30c0c86 in ExpLOBinterfaceCleanup (lobGlob=<optimized out>,
>     lobHeap=<optimized out>) at ../exp/ExpLOBinterface.cpp:100
> #4  0x00007ffff4cd0b8d in ExHdfsScanTcb::~ExHdfsScanTcb (this=0x7fffd6fd69b0,
>     __in_chrg=<optimized out>) at ../executor/ExHdfsScan.cpp:178
> #5  0x00007ffff4cd0ca1 in ExHdfsScanTcb::~ExHdfsScanTcb (this=0x7fffd6fd69b0,
>     __in_chrg=<optimized out>) at ../executor/ExHdfsScan.cpp:179
> #6  0x00007ffff4b5ba20 in ex_globals::cleanupTcbs (this=0x7fffe96b5700)
>     at ../executor/ex_globals.cpp:192
> #7  0x00007ffff4b5eefd in ex_globals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_globals.cpp:138
> #8  0x00007ffff4b448c9 in ExExeStmtGlobals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_exe_stmt_globals.cpp:303
> #9  0x00007ffff4b44bc9 in ExMasterStmtGlobals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_exe_stmt_globals.cpp:654
> #10 0x00007ffff4b84b46 in ex_root_tcb::deallocAndDelete (
>     this=<optimized out>, glob=0x7fffe96b5700, fragTable=<optimized out>)
>     at ../executor/ex_root.cpp:2430
> #11 0x00007ffff5fa86b7 in CliStatement::releaseTcbs (this=<optimized out>,
>     closeAllOpens=<optimized out>) at ../cli/Statement.cpp:6056
> #12 0x00007ffff5fa8843 in CliStatement::dealloc (this=0x7fffe96c1f90,
>     closeAllOpens=0) at ../cli/Statement.cpp:6104
> #13 0x00007ffff5fa8f24 in CliStatement::~CliStatement (this=0x7fffe96c1f90,
>     __in_chrg=<optimized out>) at ../cli/Statement.cpp:571
> #14 0x00007ffff5fa94f1 in CliStatement::~CliStatement (this=0x7fffe96c1f90,
>     __in_chrg=<optimized out>) at ../cli/Statement.cpp:741
> #15 0x00007ffff5f73a28 in ContextCli::deallocStmt (this=<optimized out>,
>     statement_id=0x1f925d0, deallocStaticStmt=0) at ../cli/Context.cpp:2328
> #16 0x00007ffff5f59ecf in SQLCLI_DeallocStmt (cliGlobals=<optimized out>,
>     statement_id=0x1f925d0) at ../cli/Cli.cpp:1649
> #17 0x00007ffff5fba706 in SQL_EXEC_DeallocStmt (statement_id=0x1f925d0)
>     at ../cli/CliExtern.cpp:1823
> I added a test case into regress/hive/TEST003 to demo this scenario; it will be available when I check in the fix.
> Assigned to LaunchPad User Mike Hanlon
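
The two backtraces above show a classic shutdown deadlock: the reader thread is parked in a condition wait for a free prefetch buffer, while the main thread sits in pthread_join waiting for that same reader to exit. Because [FIRST n] stops consuming early, nobody frees a buffer and the wait never ends. The sketch below is not the Trafodion code or the actual fix; it is a minimal illustration of one common way to avoid the pattern, assuming a shutdown flag that is set and signaled before the join. PrefetchCursor, firstNThenCleanup and all member names are hypothetical, and std::condition_variable stands in for ExLobLock.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a prefetch cursor with one reader thread.
class PrefetchCursor {
public:
    explicit PrefetchCursor(std::size_t maxBuffers) : maxBuffers_(maxBuffers) {}

    ~PrefetchCursor() {
        {
            // Set the flag under the mutex so the reader cannot miss it
            // between checking the predicate and going to sleep.
            std::lock_guard<std::mutex> guard(mutex_);
            shuttingDown_ = true;
        }
        spaceAvailable_.notify_all();            // wake a reader blocked on a full list
        if (worker_.joinable()) worker_.join();  // safe now: the reader will exit
    }

    void start() { worker_ = std::thread(&PrefetchCursor::prefetchLoop, this); }

    // Consumer side: take one buffer if available and signal the freed slot.
    bool consume() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (buffers_.empty()) return false;
        buffers_.pop_front();
        spaceAvailable_.notify_one();
        return true;
    }

private:
    void prefetchLoop() {
        int next = 0;
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            // Wait for a free slot OR for shutdown; without the shutdown
            // condition this wait is where the reader would hang.
            spaceAvailable_.wait(lock, [this] {
                return shuttingDown_ || buffers_.size() < maxBuffers_;
            });
            if (shuttingDown_) return;
            assert(buffers_.size() < maxBuffers_);
            buffers_.push_back(next++);          // pretend a buffer was read
        }
    }

    const std::size_t maxBuffers_;
    std::mutex mutex_;
    std::condition_variable spaceAvailable_;
    std::deque<int> buffers_;
    bool shuttingDown_ = false;
    std::thread worker_;
};

// Simulates a [FIRST n] consumer: take n buffers, then abandon the cursor
// while the reader may be blocked on a full prefetch list.
std::size_t firstNThenCleanup(std::size_t n) {
    PrefetchCursor cursor(4);
    cursor.start();
    std::size_t taken = 0;
    while (taken < n) {
        if (cursor.consume()) ++taken;
        else std::this_thread::yield();
    }
    return taken;  // the destructor runs on return; it must not hang
}
```

Calling firstNThenCleanup(2) from a test driver returns only if the destructor's join completes. If the shutdown condition is removed from the wait predicate, the same call would be expected to block in join, much like frame #0 of the main thread's backtrace.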


