Posted to issues@hawq.apache.org by "Lei Chang (JIRA)" <ji...@apache.org> on 2016/01/24 02:23:40 UTC
[jira] [Updated] (HAWQ-252) Coredump When RM Reconnect libyarn
[ https://issues.apache.org/jira/browse/HAWQ-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lei Chang updated HAWQ-252:
---------------------------
Fix Version/s: 2.0.0
> Coredump When RM Reconnect libyarn
> ----------------------------------
>
> Key: HAWQ-252
> URL: https://issues.apache.org/jira/browse/HAWQ-252
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Resource Manager
> Reporter: Lin Wen
> Assignee: Lin Wen
> Fix For: 2.0.0
>
>
> The resource manager (RM) dumps core when it reconnects to libyarn.
> Missing separate debuginfos, use: debuginfo-install hawq-2.0.0.0_beta-19011.x86_64
> (gdb) bt
> #0 0x0000000000e661f8 in std::string::_Rep::_S_empty_rep_storage ()
> #1 0x00007f7f1f20947c in libyarn::LibYarnClient::dummyAllocate (this=<value optimized out>)
> at /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnclient/LibYarnClient.cpp:330
> #2 0x00007f7f1f209988 in libyarn::heartbeatFunc (args=<value optimized out>)
> at /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnclient/LibYarnClient.cpp:114
> #3 0x000000350b4079d1 in start_thread () from /lib64/libpthread.so.0
> #4 0x000000350b0e8b6d in clone () from /lib64/libc.so.6
> (gdb) info thread
> 4 Thread 0x7f7efc239700 (LWP 760442) 0x000000350b40b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> 3 Thread 0x7f7f1a1758c0 (LWP 760441) 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> 2 Thread 0x7f7efae37700 (LWP 760797) 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> * 1 Thread 0x7f7efb838700 (LWP 760443) 0x0000000000e661f8 in std::string::_Rep::_S_empty_rep_storage ()
> (gdb) thread 2
> [Switching to thread 2 (Thread 0x7f7efae37700 (LWP 760797))]#0 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> (gdb) bt
> #0 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> #1 0x000000350b0e1e54 in usleep () from /lib64/libc.so.6
> #2 0x00007f7f1f209999 in libyarn::heartbeatFunc (args=<value optimized out>)
> at /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnclient/LibYarnClient.cpp:131
> #3 0x000000350b4079d1 in start_thread () from /lib64/libpthread.so.0
> #4 0x000000350b0e8b6d in clone () from /lib64/libc.so.6
> (gdb) thread 3
> [Switching to thread 3 (Thread 0x7f7f1a1758c0 (LWP 760441))]#0 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> (gdb) bt
> #0 0x000000350b0accdd in nanosleep () from /lib64/libc.so.6
> #1 0x000000350b0e1e54 in usleep () from /lib64/libc.so.6
> #2 0x00000000008dd8b9 in RB2YARN_registerYARNApplication () at resourcebroker_LIBYARN_proc.c:1354
> #3 0x00000000008df8ad in RB2YARN_initializeConnection () at resourcebroker_LIBYARN_proc.c:1270
> #4 0x00000000008dfc93 in ResBrokerMainInternal () at resourcebroker_LIBYARN_proc.c:202
> #5 0x00000000008dff79 in ResBrokerMain () at resourcebroker_LIBYARN_proc.c:157
> #6 0x00000000008dc246 in RB_LIBYARN_start (isforked=<value optimized out>) at resourcebroker_LIBYARN.c:153
> #7 0x0000000000903bda in MainHandlerLoop () at resourcemanager.c:531
> #8 0x00000000009041f1 in ResManagerMainServer2ndPhase () at resourcemanager.c:508
> #9 0x0000000000904624 in ResManagerMain (argc=<value optimized out>, argv=<value optimized out>) at resourcemanager.c:330
> #10 0x00000000009049b1 in ResManagerProcessStartup () at resourcemanager.c:402
> #11 0x0000000000764b08 in CommenceNormalOperations () at postmaster.c:3616
> #12 0x00000000007659c2 in do_reaper () at postmaster.c:3964
> #13 0x000000000076a01d in ServerLoop () at postmaster.c:2102
> #14 0x000000000076bb5e in PostmasterMain (argc=9, argv=0x32a15b0) at postmaster.c:1421
> #15 0x00000000006c691a in main (argc=9, argv=0x32a1570) at main.c:226
> There are two heartbeat threads at this moment (threads 1 and 2 are both inside libyarn::heartbeatFunc), which means the earlier heartbeat thread was not canceled when the RM reconnected to libyarn.
> In ResBrokerMainInternal(), starting at line 270, the heartbeat thread should be canceled before RB2YARN_disconnectFromYARN is called.
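
The ordering proposed above (stop the heartbeat thread first, then disconnect) can be sketched roughly as follows. This is only an illustration, not the libyarn or HAWQ code: HeartbeatSession and all of its members are hypothetical names, and it uses a cooperative stop flag with std::thread::join rather than whatever cancellation mechanism the real heartbeat thread uses.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a client that owns a heartbeat thread.
// It is NOT the real LibYarnClient API.
class HeartbeatSession {
public:
    ~HeartbeatSession() { disconnect(); }

    // (Re)connect: stop any previous heartbeat thread first, so that at most
    // one heartbeat thread exists at any time.
    void connect() {
        stopHeartbeat();
        connected_ = true;
        keepRunning_ = true;
        worker_ = std::thread([this] { heartbeatLoop(); });
    }

    // Disconnect: stop the heartbeat thread BEFORE tearing down the state
    // that the thread reads.
    void disconnect() {
        stopHeartbeat();
        connected_ = false;
    }

    bool heartbeatRunning() const { return worker_.joinable(); }
    int beats() const { return beats_.load(); }

private:
    // Ask the thread to exit and wait until it has actually finished.
    void stopHeartbeat() {
        keepRunning_ = false;
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    void heartbeatLoop() {
        while (keepRunning_) {
            // The connection state must outlive the heartbeat thread.
            assert(connected_);
            ++beats_;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> keepRunning_{false};
    std::atomic<bool> connected_{false};
    std::atomic<int> beats_{0};
    std::thread worker_;
};
```

Because connect() always joins the previous thread before starting a new one, calling it twice in a row (a reconnect) cannot leave a stale heartbeat thread running against state that has already been released, which is the situation the backtraces above appear to show.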
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)