Posted to issues@trafodion.apache.org by "Selvaganesan Govindarajan (JIRA)" <ji...@apache.org> on 2017/10/20 19:01:00 UTC

[jira] [Commented] (TRAFODION-2780) Mxosrvr dumps core when connection idle timer expires at times

    [ https://issues.apache.org/jira/browse/TRAFODION-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213061#comment-16213061 ] 

Selvaganesan Govindarajan commented on TRAFODION-2780:
------------------------------------------------------

Analysis of the core from the customer shows the following:

(gdb) p cli_globals->defaultContext_->hbaseClientJNI_->tid_
$3 = 47059

However, the connection idle timer fires in thread 47052 and attempts to close the statements still open on the connection as part of connection end processing. The statement being closed is a select statement that the application had not closed.

Closing the statement from a different thread violates SQL concepts, because opens are thread specific. This needs to be fixed in mxosrvr so that the connection close processing triggered by ConnectionIdleTimer happens in the SQL thread, 47059. A disconnect from the client would have been processed in 47059.
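
To illustrate the kind of handoff described above, here is a minimal, self-contained C++ sketch. It is not mxosrvr code: SqlThread, post() and idle_timer_closes_on_owner_thread() are hypothetical names invented for this example, and the close itself is reduced to recording which thread ran it. The only point is that the timer callback queues the close and waits, rather than calling into SQL from its own thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the thread that owns the SQL context
// (47059 in the analysis above). All SQL work is funnelled through it.
class SqlThread {
public:
    SqlThread() : worker_([this] { run(); }) {}

    ~SqlThread() {
        post([this] { done_ = true; });  // stop request runs on the worker
        worker_.join();
    }

    std::thread::id id() const { return worker_.get_id(); }

    // Queue a task; the returned future completes once the task has
    // run on the owner thread.
    std::future<void> post(std::function<void()> task) {
        auto packaged =
            std::make_shared<std::packaged_task<void()>>(std::move(task));
        std::future<void> result = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push([packaged] { (*packaged)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        while (!done_) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // executed outside the lock, on the owner thread
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;   // read and written only on the worker thread
    std::thread worker_;  // declared last so the members above exist first
};

// Simulates an idle-timer expiry arriving on the calling thread. The
// "close" is handed to the owner thread instead of being run here.
bool idle_timer_closes_on_owner_thread() {
    SqlThread sql;
    std::thread::id closed_on;
    sql.post([&] { closed_on = std::this_thread::get_id(); }).get();
    assert(closed_on != std::this_thread::get_id());
    return closed_on == sql.id();
}
```

In this sketch the caller blocks on the future until the owner thread has finished the close, so connection end processing stays ordered. Whether mxosrvr should block the timer thread in the same way, or use a different handoff mechanism, is a design decision this example does not settle.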

> Mxosrvr dumps core when connection idle timer expires at times
> --------------------------------------------------------------
>
>                 Key: TRAFODION-2780
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2780
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: connectivity-mxosrvr
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>
> Mxosrvr occasionally dumps core when the connection idle timer expires, with the following stack trace. This core is accompanied by an mxssmp core.
> Thread 1 (Thread 0x7f95e8cd4a00 (LWP 47052)):
> #0  0x00007f95e4b135f7 in raise () from /lib64/libc.so.6
> #1  0x00007f95e4b14e28 in abort () from /lib64/libc.so.6
> #2  0x00007f95e1158bef in assert_botch_abend (f=f@entry=0x7f95e2ee50d7 "../executor/ex_root.cpp", l=l@entry=3055, 
>     m=m@entry=0x7f95e2ee5338 "Timeout waiting for control broker.", c=c@entry=0x0) at ../export/NAAbort.cpp:277
> #3  0x00007f95e2da911b in ex_root_tcb::cbMessageWait (this=0x7f95af1d4d78, 
>     waitStartTime=waitStartTime@entry=212373626065770962) at ../executor/ex_root.cpp:3055
> #4  0x00007f95e44515c4 in CliStatement::releaseTransaction (this=this@entry=0x7f95e8b33d70, 
>     allWorkRequests=allWorkRequests@entry=1, alwaysSendReleaseMsg=alwaysSendReleaseMsg@entry=0, 
>     statementRemainsOpen=statementRemainsOpen@entry=0) at ../cli/Statement.cpp:965
> #5  0x00007f95e4451990 in CliStatement::releaseTcbs (this=this@entry=0x7f95e8b33d70, 
>     closeAllOpens=closeAllOpens@entry=0) at ../cli/Statement.cpp:4306
> #6  0x00007f95e4451b33 in CliStatement::dealloc (this=this@entry=0x7f95e8b33d70, 
>     closeAllOpens=closeAllOpens@entry=0) at ../cli/Statement.cpp:4394
> #7  0x00007f95e445269a in CliStatement::close (this=this@entry=0x7f95e8b33d70, diagsArea=..., 
>     inRollback=inRollback@entry=0) at ../cli/Statement.cpp:1140
> #8  0x00007f95e4411704 in SQLCLI_PerformTasks(CliGlobals *, ULng32, SQLSTMT_ID *, SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef __va_list_tag __va_list_tag *, SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) (cliGlobals=<optimized out>, 
>     tasks=1800, statement_id=<optimized out>, input_descriptor=input_descriptor@entry=0x0, 
>     output_descriptor=output_descriptor@entry=0x0, num_input_ptr_pairs=num_input_ptr_pairs@entry=0, 
>     num_output_ptr_pairs=num_output_ptr_pairs@entry=0, ap=ap@entry=0x0, input_ptr_pairs=input_ptr_pairs@entry=0x0, 
>     output_ptr_pairs=output_ptr_pairs@entry=0x0) at ../cli/Cli.cpp:3465
> #9  0x00007f95e4411d04 in SQLCLI_CloseStmt (cliGlobals=<optimized out>, statement_id=<optimized out>)
>     at ../cli/Cli.cpp:3518
> #10 0x00007f95e445c83f in SQL_EXEC_CloseStmt (statement_id=0x10eed778) at ../cli/CliExtern.cpp:1432
> #11 0x00007f95e7379757 in SRVR::releaseCachedObject (internalStmt=internalStmt@entry=0, 
>     mxsrvr_substate=mxsrvr_substate@entry=NDCS_CONN_IDLE) at srvrcommon.cpp:764
> #12 0x00000000004bdf45 in SRVR::connIdleTimerExpired (timer_tag=<optimized out>) at SrvrConnect.cpp:4648
> #13 0x0000000000490362 in BUILD_TIMER_MSG_CALL (call_id_=<optimized out>, request=<optimized out>, 
>     countRead=<optimized out>, receive_info=<optimized out>) at ../Common/FileSystemSrvr.cpp:601
> #14 0x0000000000492075 in CNSKListener::CheckReceiveMessage (this=0x263bea0, cc=@0x7ffc86677a84: 6, countRead=16, 
>     call_id=<optimized out>) at ../Common/Listener.cpp:272
> #15 0x000000000049b14e in CNSKListenerSrvr::runProgram (this=0x263bea0, TcpProcessName=<optimized out>, 
>     port=<optimized out>, TransportTrace=<optimized out>) at Interface/linux/Listener_srvr_ps.cpp:508
> #16 0x0000000000483cc6 in runCEE (TransportTrace=0, portNumber=<optimized out>, 
>     TcpProcessName=0x7ffc866789d0 "$ZTC0") at SrvrMain.cpp:167
> #17 main (argc=39, argv=0x7ffc8667a948, envp=<optimized out>) at SrvrMain.cpp:864



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)