Posted to issues@trafodion.apache.org by "Selvaganesan Govindarajan (JIRA)" <ji...@apache.org> on 2018/10/30 16:44:00 UTC

[jira] [Commented] (TRAFODION-3225) Obscure cores seen in RMS and logger related code when Trafodion is stressed

    [ https://issues.apache.org/jira/browse/TRAFODION-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669020#comment-16669020 ] 

Selvaganesan Govindarajan commented on TRAFODION-3225:
------------------------------------------------------

The RMS-related core dumps are diagnosed as follows:

The query fragment or the registered process is deregistered from the shared segment even though the process continues to exist. The process is left holding stale references into the shared segment, which cause the core dumps and can at times bring down the Trafodion node.
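
To illustrate the failure mode described above, here is a minimal sketch of one common way to make a stale reference detectable instead of fatal: the process keeps a small handle (slot index plus generation counter) rather than a raw pointer, and every access revalidates the handle. All names here (SlotSketch, HandleSketch, SegmentSketch, rowsAccessed) are hypothetical and are not Trafodion's. The sketch models the shared segment as an ordinary in-process array with no real shared memory or locking, and it is not a description of the fix applied for this issue.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one registration slot in the shared segment.
struct SlotSketch {
    bool inUse = false;
    std::uint32_t generation = 0;   // bumped on every deregistration
    std::int64_t rowsAccessed = 0;  // placeholder statistic
};

// Hypothetical handle a process keeps instead of a raw pointer to its slot.
struct HandleSketch {
    std::size_t index = 0;
    std::uint32_t generation = 0;
};

// Assumption: the segment is modelled as a plain array inside one process.
class SegmentSketch {
public:
    // Claims the first free slot; returns false when the segment is full.
    bool registerProcess(HandleSketch& out) {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i].inUse) {
                slots_[i].inUse = true;
                slots_[i].rowsAccessed = 0;
                out.index = i;
                out.generation = slots_[i].generation;
                return true;
            }
        }
        return false;
    }

    // Frees a slot. The owning process may still be alive and still hold
    // a handle to it, which is the situation described in the comment.
    void deregister(std::size_t index) {
        assert(index < slots_.size());
        if (slots_[index].inUse) {
            slots_[index].inUse = false;
            ++slots_[index].generation;  // invalidates outstanding handles
        }
    }

    // Returns the slot only while the handle matches a live registration;
    // a stale handle yields nullptr instead of a dangling reference.
    SlotSketch* resolve(const HandleSketch& h) {
        if (h.index >= slots_.size()) return nullptr;
        SlotSketch& s = slots_[h.index];
        if (!s.inUse || s.generation != h.generation) return nullptr;
        return &s;
    }

private:
    std::array<SlotSketch, 4> slots_{};
};
```

In this sketch a caller would invoke resolve() before every access and, on nullptr, either register again or skip the statistics update. A raw pointer kept across the deregistration would instead keep pointing at a slot that may already belong to another registration.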

> Obscure cores seen in RMS and logger related code when Trafodion is stressed
> ----------------------------------------------------------------------------
>
>                 Key: TRAFODION-3225
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3225
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>            Priority: Major
>
> During stress testing of the enterprise edition of Trafodion, the following problems are seen.
> Thread 1 (Thread 0x7efee4046700 (LWP 26304)):
> #0  0x00007eff1ad045f7 in raise () from /lib64/libc.so.6
> #1  0x00007eff1ad05e28 in abort () from /lib64/libc.so.6
> #2  0x00007eff1609b94e in assert_botch_abend (f=0x7eff1a7eac75 "../cli/Statement.cpp", l=6178, m=0x7eff1a7eb9e0 "StmtStats_ is null after addQuery", c=0x0) at ../export/NAAbort.cpp:285
> #3  0x00007eff1a777df9 in Statement::setStmtStats (this=0x7efee2bd94d0, autoRetry=0) at ../cli/Statement.cpp:6178
> #4  0x00007eff1a6d08ed in SQLCLI_ExecDirect2(CliGlobals *, SQLSTMT_ID *, SQLDESC_ID *, Int32, SQLDESC_ID *, Lng32, typedef __va_list_tag __va_list_tag *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x22691d0, statement_id=0x4d45068, sql_source=0x7efee40420f0, prepFlags=0, input_descriptor=0x0, num_ptr_pairs=0, ap=0x7efee4041e90, ptr_pairs=0x0) at ../cli/Cli.cpp:3317
> #5  0x00007eff1a789c88 in SQL_EXEC_ExecDirect2 (statement_id=0x4d45068, sql_source=0x7efee40420f0, prep_flags=0, input_descriptor=0x0, num_ptr_pairs=0) at ../cli/CliExtern.cpp:2090
> #6  0x00007eff1d4c30ab in SRVR::WSQL_EXEC_ExecDirect (statement_id=0x4d45068, sql_source=0x7efee40420f0, input_descriptor=0x0, num_ptr_pairs=0) at SQLWrapper.cpp:364
> #7  0x00007eff1d4aaa8b in SRVR::EXECDIRECT (pSrvrStmt=0x4d44a50) at sqlinterface.cpp:4700
> #8  0x00007eff1d438280 in SRVR::ControlProc (pParam=0x4d44a50) at csrvrstmt.cpp:768
> #9  0x00007eff1d4378b7 in SRVR_STMT_HDL::ExecDirect (this=0x4d44a50, inCursorName=0x0, inSqlString=0x5f64aa8 "update Trafodion.\"_REPOS_\".metric_query_aggr_table set AGGREGATION_LAST_UPDATE_UTC_TS = CONVERTTIMESTAMP(212406077134960312),AGGREGATION_LAST_ELAPSED_TIME = 60000,TOTAL_EST_ROWS_ACCESSED = 0,TOTAL_EST"..., inStmtType=1, inSqlStmtType=0, inSqlAsyncEnable=0, inQueryTimeout=0) at csrvrstmt.cpp:450
> #10 0x000000000056f059 in SessionWatchDog (arg=0x0) at SrvrConnect.cpp:1194
> #11 0x00007eff1dbe1dc5 in start_thread () from /lib64/libpthread.so.0
> #12 0x00007eff1adc5ced in clone () from /lib64/libc.so.6
>
> And other obscure cores related to ExStatisticsArea.
>
> The logger infrastructure fails with the following stack trace or some other variations in the logger code.
>  
> #0  0x00007f4afb8bb495 in raise () from /lib64/libc.so.6
> #1  0x00007f4afb8bcc75 in abort () from /lib64/libc.so.6
> #2  0x00007f4afa705a8d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
> #3  0x00007f4afa703be6 in ?? () from /usr/lib64/libstdc++.so.6
> #4  0x00007f4afa703c13 in std::terminate() () from /usr/lib64/libstdc++.so.6
> #5  0x00007f4afa70456f in __cxa_pure_virtual () from /usr/lib64/libstdc++.so.6
> #6  0x00007f4afb203f60 in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #7  0x00007f4afb38ba9f in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #8  0x00007f4afb38d47f in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #9  0x00007f4afb200ef2 in JVM_handle_linux_signal () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #10 0x00007f4afb1f6753 in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #11 <signal handler called>
> #12 0x00007f4ad6561e2c in log4cxx::helpers::Transcoder::decode (src=
>     "SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\200yh\002\000\000\000\000\b\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.HDFS`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\020eh\002\000\000\000\000\016\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.EXE.0\005\000\000\000\000\000\000@\000\000\000\000\000\000\000\200\210\210\004\000\000\000\000\a\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.Qmp\000\376\377\377\377\066\300\n\360org.apacp\005\000\000\000\000\000\000\060\000\000\000\000\000\000\000\017\000\000\000\000\000\000\000\017", '\000' <repeats 15 times>, "orc_proto.proto\000A", '\000' <repeats 11 times>..., dst="SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000")
>     at transcoder.cpp:261
> #13 0x00007f4ad6517ca1 in log4cxx::LogManager::getLogger (name=<value optimized out>) at logmanager.cpp:120
> #14 0x00007f4ad6510c49 in log4cxx::Logger::getLogger (name=<value optimized out>) at logger.cpp:490
> #15 0x00007f4ad30fc9a1 in QRLogger::log (cat=
>     "SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\200yh\002\000\000\000\000\b\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.HDFS`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\020eh\002\000\000\000\000\016\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.EXE.0\005\000\000\000\000\000\000@\000\000\000\000\000\000\000\200\210\210\004\000\000\000\000\a\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.Qmp\000\376\377\377\377\066\300\n\360org.apacp\005\000\000\000\000\000\000\060\000\000\000\000\000\000\000\017\000\000\000\000\000\000\000\017", '\000' <repeats 15 times>, "orc_proto.proto\000A", '\000' <repeats 11 times>..., level=LL_DEBUG,
>     logMsgTemplate=0x7f4ad96d2430 "ExpHbaseInterface_JNI::init() creating new client.") at ../qmscommon/QRLogger.cpp:567
> #16 0x00007f4ad8e1e6b4 in ExpHbaseInterface_JNI::init (this=0x7f4ac7ef4870, hbs=0x0) at ../exp/ExpHbaseInterface.cpp:488



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)