Posted to issues@kudu.apache.org by "Michael Ho (JIRA)" <ji...@apache.org> on 2019/02/20 23:54:00 UTC

[jira] [Updated] (KUDU-2706) Race in CanonicalizeKrb5Principal() due to lazy initialization of g_kinit_ctx->default_realm

     [ https://issues.apache.org/jira/browse/KUDU-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho updated KUDU-2706:
-----------------------------
    Description: 
As far as I understand, the assumption is that {{g_krb5_ctx}} is global and shared, and that it should not be modified after initialization. However, various code paths in {{kudu::security}} call Kerberos library functions that may inadvertently modify {{g_krb5_ctx}}.

The default initialization code {{krb5_init_context(&g_krb5_ctx)}}, called by {{kudu::security::InitKrb5Ctx()}}, only sets {{g_krb5_ctx->default_realm}} to 0. On the first call to {{krb5_parse_name()}}, the Kerberos library calls {{krb5_get_default_realm()}} to look up the default realm, because the default realm is still {{NULL}}:

{noformat}
krb5_error_code KRB5_CALLCONV
krb5_get_default_realm(krb5_context context, char **realm_out)
{
    krb5_error_code ret;

    *realm_out = NULL;

    if (context == NULL || context->magic != KV5M_CONTEXT)
        return KV5M_CONTEXT;

    if (context->default_realm == NULL) {
        ret = get_default_realm(context, &context->default_realm); /* <<<----- non-thread-safe call */
        if (ret)
            return ret;
    }
    *realm_out = strdup(context->default_realm);
    return (*realm_out == NULL) ? ENOMEM : 0;
}
{noformat}
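
The excerpt above is a check-then-assign on a field of a shared object. One way to keep that write off the concurrent path is to trigger the lookup once, on a single thread, before any worker threads exist. The sketch below models this with a stand-in struct; {{ModelContext}}, {{DefaultRealm()}}, {{InitModelContext()}}, {{WarmThenRead()}} and the realm {{EXAMPLE.COM}} are all hypothetical names, not part of libkrb5 or Kudu. Whether warming the cache is sufficient for the real library is an assumption, since other calls may also modify the context.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical model of a context with a lazily cached field.
// This is NOT the real krb5_context.
struct ModelContext {
  std::string default_realm;  // empty means "not looked up yet"
  int lookups = 0;            // how many times the slow lookup ran
};

// Lazy accessor mirroring the excerpt: it writes to the context
// the first time it is called, and only reads afterwards.
const std::string& DefaultRealm(ModelContext* ctx) {
  if (ctx->default_realm.empty()) {
    ctx->lookups++;
    ctx->default_realm = "EXAMPLE.COM";  // placeholder value
  }
  return ctx->default_realm;
}

// Called once from a single thread during startup, so the only
// write to the context happens before any concurrency exists.
void InitModelContext(ModelContext* ctx) {
  DefaultRealm(ctx);
}

// Warms the cache, then lets n threads call the accessor.
// Returns the number of lookups that were performed.
int WarmThenRead(int n) {
  ModelContext ctx;
  InitModelContext(&ctx);
  std::vector<std::thread> readers;
  for (int i = 0; i < n; i++) {
    readers.emplace_back([&ctx] { DefaultRealm(&ctx); });
  }
  for (auto& t : readers) t.join();
  return ctx.lookups;
}
```

After the warm-up call, the concurrent callers only read the cached value, so the lookup count stays at one regardless of how many threads run.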

Apparently, {{krb5_get_default_realm()}} may modify {{g_krb5_ctx}}, but it is not thread-safe. If multiple negotiation threads reach {{krb5_get_default_realm()}} at the same time, they may step on each other and corrupt {{g_krb5_ctx}}, leading to the crash seen in the stack trace below or to error messages like the following:

{noformat}
0216 14:26:07.459600 (+   296us) negotiation.cc:304] Negotiation complete: Runtime error: Server connection negotiation failed: server connection from X.X.X.X:37070: could not canonicalize krb5 principal: could not parse principal: Configuration file does not specify default realm
{noformat}
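
A common way to avoid this kind of race is to serialize every call that might write to the shared context behind a lock. The following is a minimal sketch under that assumption, again using a stand-in struct rather than the real Kerberos types; {{FakeCtx}}, {{g_ctx}}, {{g_ctx_lock}}, {{GetDefaultRealmLocked()}} and {{RunNegotiationThreads()}} are hypothetical names introduced for illustration only.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared Kerberos context.
struct FakeCtx {
  const char* default_realm = nullptr;  // lazily filled on first use
  int loads = 0;                        // how many times the lookup ran
};

// Mimics the lazy path: check for null, then populate.
// Not safe to call from several threads without external locking.
std::string GetDefaultRealmUnsafe(FakeCtx* ctx) {
  if (ctx->default_realm == nullptr) {
    ctx->loads++;
    ctx->default_realm = "EXAMPLE.COM";  // placeholder realm
  }
  return ctx->default_realm;
}

FakeCtx g_ctx;          // shared by all threads
std::mutex g_ctx_lock;  // guards every access to g_ctx

// All callers go through this wrapper, so the check-then-assign
// above can never run on two threads at once.
std::string GetDefaultRealmLocked() {
  std::lock_guard<std::mutex> guard(g_ctx_lock);
  return GetDefaultRealmUnsafe(&g_ctx);
}

// Starts n threads that all hit the lazy path at about the same
// time, and returns how many lookups actually happened.
int RunNegotiationThreads(int n) {
  std::vector<std::thread> threads;
  for (int i = 0; i < n; i++) {
    threads.emplace_back([] { GetDefaultRealmLocked(); });
  }
  for (auto& t : threads) t.join();
  return g_ctx.loads;
}
```

With the lock in place, only the first caller performs the lookup. The trade-off is that all callers contend on one mutex, which may matter if the guarded calls are slow.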

Stack trace showing the crash:
{noformat}
#0  0x00007fb03e1fa1f7 in raise () from sysroot/lib64/libc.so.6
#1  0x00007fb03e1fb8e8 in abort () from sysroot/lib64/libc.so.6
#2  0x00007fb041159185 in os::abort(bool) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#3  0x00007fb0412fb593 in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#4  0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#5  0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#6  <signal handler called>
#7  0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
#8  0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
#9  0x0000000004a0b4c0 in tc_free ()
#10 0x00007fb040d32933 in ElfDecoder::demangle(char const*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#11 0x00007fb040d3222a in Decoder::demangle(char const*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#12 0x00007fb04115695d in os::dll_address_to_function_name(unsigned char*, char*, int, int*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#13 0x00007fb040dc0222 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#14 0x00007fb040d2e925 in print_native_stack(outputStream*, frame, Thread*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#15 0x00007fb0412f9cc8 in VMError::report(outputStream*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#16 0x00007fb0412fb18a in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#17 0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#18 0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#19 <signal handler called>
#20 0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
#21 0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
#22 0x0000000004a0b4c0 in tc_free ()
#23 0x00007fb03e5915dd in pthread_attr_destroy () from sysroot/lib64/libpthread.so.0
#24 0x00007fb04115e49f in current_stack_region(unsigned char**, unsigned long*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#25 0x00007fb04115e535 in os::current_stack_base() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#26 0x00007fb0412faeb4 in VMError::report(outputStream*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#27 0x00007fb0412fb18a in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#28 0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#29 0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
#30 <signal handler called>
#31 0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
#32 0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
#33 0x0000000004a0b4c0 in tc_free ()
#34 0x00007fb03f051720 in profile_iterator_free () from sysroot/lib64/libkrb5.so.3
#35 0x00007fb03f0519a4 in profile_get_value () from sysroot/lib64/libkrb5.so.3
#36 0x00007fb03f051a18 in profile_get_string () from sysroot/lib64/libkrb5.so.3
#37 0x00007fb03f044dde in profile_default_realm () from sysroot/lib64/libkrb5.so.3
#38 0x00007fb03f044509 in krb5_get_default_realm () from sysroot/lib64/libkrb5.so.3
#39 0x00007fb03f0245e8 in krb5_parse_name_flags () from sysroot/lib64/libkrb5.so.3
#40 0x0000000001ff7bbf in kudu::security::CanonicalizeKrb5Principal(std::string*) ()
#41 0x00000000026ee4df in kudu::rpc::ServerNegotiation::AuthenticateBySasl(kudu::faststring*) ()
#42 0x00000000026ea929 in kudu::rpc::ServerNegotiation::Negotiate() ()
#43 0x000000000271035b in kudu::rpc::DoServerNegotiation(kudu::rpc::Connection*, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime const&) ()
#44 0x000000000271070d in kudu::rpc::Negotiation::RunNegotiation(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime) ()
#45 0x00000000026ca8ab in kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>::Run(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&) ()
#46 0x00000000026c9bf4 in kudu::internal::InvokeHelper<false, void, kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&)>::MakeItSo(kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&) ()
#47 0x00000000026c8ad3 in kudu::internal::Invoker<4, kudu::internal::BindState<kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime), void (scoped_refptr<kudu::rpc::Connection>, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>::Run(kudu::internal::BindStateBase*) ()
#48 0x0000000001dae84c in kudu::Callback<void ()>::Run() const ()
#49 0x000000000295a66a in kudu::ClosureRunnable::Run() ()
#50 0x00000000029595fd in kudu::ThreadPool::DispatchThread() ()
#51 0x00000000029650d5 in boost::_mfi::mf0<void, kudu::ThreadPool>::operator()(kudu::ThreadPool*) const ()
#52 0x0000000002964602 in void boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> >::operator()<boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, kudu::ThreadPool>&, boost::_bi::list0&, int) ()
#53 0x0000000002963a05 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> > >::operator()() ()
#54 0x0000000002962b61 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> > >, void>::invoke(boost::detail::function::function_buffer&) ()
#55 0x0000000001d76514 in boost::function0<void>::operator()() const ()
#56 0x0000000001d72da2 in kudu::Thread::SuperviseThread(void*) ()
#57 0x00007fb03e58fe25 in start_thread () from sysroot/lib64/libpthread.so.0
#58 0x00007fb03e2bd34d in clone () from sysroot/lib64/libc.so.6
{noformat}

[~tlipcon] kindly pointed out that a similar issue was reported upstream in Kerberos in the past (http://krbdev.mit.edu/rt/Ticket/Display.html?id=2855).


> Race in CanonicalizeKrb5Principal() due to lazy initialization of g_kinit_ctx->default_realm
> --------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2706
>                 URL: https://issues.apache.org/jira/browse/KUDU-2706
>             Project: Kudu
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.8.0
>            Reporter: Michael Ho
>            Assignee: Michael Ho
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)