Posted to issues@kudu.apache.org by "Andrew Wong (JIRA)" <ji...@apache.org> on 2019/03/05 19:44:00 UTC

[jira] [Resolved] (KUDU-2706) Race in CanonicalizeKrb5Principal() due to lazy initialization of g_kinit_ctx->default_realm

     [ https://issues.apache.org/jira/browse/KUDU-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wong resolved KUDU-2706.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.9.0

Michael added a workaround for this in 25af98eaf4c712bef9033721ea58b3f0d0a78c32.

> Race in CanonicalizeKrb5Principal() due to lazy initialization of g_kinit_ctx->default_realm
> --------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2706
>                 URL: https://issues.apache.org/jira/browse/KUDU-2706
>             Project: Kudu
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.8.0
>            Reporter: Michael Ho
>            Assignee: Michael Ho
>            Priority: Critical
>             Fix For: 1.9.0
>
>
> As far as I understand, the assumption is that {{g_krb5_ctx}} is global and shared, and that it should not be modified after initialization. However, various code in {{kudu::security}} calls Kerberos library functions that may inadvertently modify {{g_krb5_ctx}}.
> The default initialization code {{krb5_init_context(&g_krb5_ctx)}}, called by {{kudu::security::InitKrb5Ctx()}}, only sets {{g_krb5_ctx->default_realm}} to 0. On the first call to {{krb5_parse_name()}}, the Kerberos library calls {{krb5_get_default_realm()}} to look up the default realm, because {{default_realm}} is still {{NULL}}:
> {noformat}
> krb5_error_code KRB5_CALLCONV
> krb5_get_default_realm(krb5_context context, char **realm_out)
> {
>     krb5_error_code ret;
>     *realm_out = NULL;
>     if (context == NULL || context->magic != KV5M_CONTEXT)
>         return KV5M_CONTEXT;
>     if (context->default_realm == NULL) {
>         ret = get_default_realm(context, &context->default_realm); /* <-- not thread-safe */
>         if (ret)
>             return ret;
>     }
>     *realm_out = strdup(context->default_realm);
>     return (*realm_out == NULL) ? ENOMEM : 0;
> }
> {noformat}
> {{krb5_get_default_realm()}} may therefore modify {{g_krb5_ctx}}, but it is not thread-safe. If multiple negotiation threads reach {{krb5_get_default_realm()}} at the same time, they may step on each other and corrupt {{g_krb5_ctx}}, leading to the crash seen in the stack trace below or to error messages like the following:
> {noformat}
> 0216 14:26:07.459600 (+   296us) negotiation.cc:304] Negotiation complete: Runtime error: Server connection negotiation failed: server connection from X.X.X.X:37070: could not canonicalize krb5 principal: could not parse principal: Configuration file does not specify default realm
> {noformat}
> Stack trace showing the crash:
> {noformat}
> #0  0x00007fb03e1fa1f7 in raise () from sysroot/lib64/libc.so.6
> #1  0x00007fb03e1fb8e8 in abort () from sysroot/lib64/libc.so.6
> #2  0x00007fb041159185 in os::abort(bool) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #3  0x00007fb0412fb593 in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #4  0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #5  0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #6  <signal handler called>
> #7  0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
> #8  0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
> #9  0x0000000004a0b4c0 in tc_free ()
> #10 0x00007fb040d32933 in ElfDecoder::demangle(char const*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #11 0x00007fb040d3222a in Decoder::demangle(char const*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #12 0x00007fb04115695d in os::dll_address_to_function_name(unsigned char*, char*, int, int*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #13 0x00007fb040dc0222 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #14 0x00007fb040d2e925 in print_native_stack(outputStream*, frame, Thread*, char*, int) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #15 0x00007fb0412f9cc8 in VMError::report(outputStream*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #16 0x00007fb0412fb18a in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #17 0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #18 0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #19 <signal handler called>
> #20 0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
> #21 0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
> #22 0x0000000004a0b4c0 in tc_free ()
> #23 0x00007fb03e5915dd in pthread_attr_destroy () from sysroot/lib64/libpthread.so.0
> #24 0x00007fb04115e49f in current_stack_region(unsigned char**, unsigned long*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #25 0x00007fb04115e535 in os::current_stack_base() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #26 0x00007fb0412faeb4 in VMError::report(outputStream*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #27 0x00007fb0412fb18a in VMError::report_and_die() () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #28 0x00007fb04115e68f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #29 0x00007fb041154be3 in signalHandler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.8.0_141-cloudera/jre/lib/amd64/server/libjvm.so
> #30 <signal handler called>
> #31 0x00000000048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
> #32 0x00000000048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
> #33 0x0000000004a0b4c0 in tc_free ()
> #34 0x00007fb03f051720 in profile_iterator_free () from sysroot/lib64/libkrb5.so.3
> #35 0x00007fb03f0519a4 in profile_get_value () from sysroot/lib64/libkrb5.so.3
> #36 0x00007fb03f051a18 in profile_get_string () from sysroot/lib64/libkrb5.so.3
> #37 0x00007fb03f044dde in profile_default_realm () from sysroot/lib64/libkrb5.so.3
> #38 0x00007fb03f044509 in krb5_get_default_realm () from sysroot/lib64/libkrb5.so.3
> #39 0x00007fb03f0245e8 in krb5_parse_name_flags () from sysroot/lib64/libkrb5.so.3
> #40 0x0000000001ff7bbf in kudu::security::CanonicalizeKrb5Principal(std::string*) ()
> #41 0x00000000026ee4df in kudu::rpc::ServerNegotiation::AuthenticateBySasl(kudu::faststring*) ()
> #42 0x00000000026ea929 in kudu::rpc::ServerNegotiation::Negotiate() ()
> #43 0x000000000271035b in kudu::rpc::DoServerNegotiation(kudu::rpc::Connection*, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime const&) ()
> #44 0x000000000271070d in kudu::rpc::Negotiation::RunNegotiation(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime) ()
> #45 0x00000000026ca8ab in kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>::Run(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&) ()
> #46 0x00000000026c9bf4 in kudu::internal::InvokeHelper<false, void, kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&)>::MakeItSo(kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&) ()
> #47 0x00000000026c8ad3 in kudu::internal::Invoker<4, kudu::internal::BindState<kudu::internal::RunnableAdapter<void (*)(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime), void (scoped_refptr<kudu::rpc::Connection>, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void (scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>::Run(kudu::internal::BindStateBase*) ()
> #48 0x0000000001dae84c in kudu::Callback<void ()>::Run() const ()
> #49 0x000000000295a66a in kudu::ClosureRunnable::Run() ()
> #50 0x00000000029595fd in kudu::ThreadPool::DispatchThread() ()
> #51 0x00000000029650d5 in boost::_mfi::mf0<void, kudu::ThreadPool>::operator()(kudu::ThreadPool*) const ()
> #52 0x0000000002964602 in void boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> >::operator()<boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, kudu::ThreadPool>&, boost::_bi::list0&, int) ()
> #53 0x0000000002963a05 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> > >::operator()() ()
> #54 0x0000000002962b61 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> > >, void>::invoke(boost::detail::function::function_buffer&) ()
> #55 0x0000000001d76514 in boost::function0<void>::operator()() const ()
> #56 0x0000000001d72da2 in kudu::Thread::SuperviseThread(void*) ()
> #57 0x00007fb03e58fe25 in start_thread () from sysroot/lib64/libpthread.so.0
> #58 0x00007fb03e2bd34d in clone () from sysroot/lib64/libc.so.6
> {noformat}
> [~tlipcon] kindly pointed out that someone reported a similar issue in Kerberos upstream in the past (http://krbdev.mit.edu/rt/Ticket/Display.html?id=2855).
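
For readers unfamiliar with this class of bug, the following is a minimal sketch of the lazy-initialization pattern described above, with the shared field guarded by a lock. It does not call libkrb5 and is not necessarily what the referenced commit does. {{FakeKrb5Context}}, {{g_fake_ctx}}, {{g_fake_ctx_lock}}, {{LoadRealmFromProfile()}}, {{GetDefaultRealmLocked()}} and the realm {{EXAMPLE.COM}} are all hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for krb5_context: default_realm starts out null and
// is filled in lazily on first use, as in the issue description.
struct FakeKrb5Context {
  std::unique_ptr<std::string> default_realm;
};

static FakeKrb5Context g_fake_ctx;         // shared by all worker threads
static std::mutex g_fake_ctx_lock;         // serializes the lazy initialization
static std::atomic<int> g_realm_loads{0};  // counts how often the realm is loaded

// Models the profile lookup that populates the realm (assumed value).
static std::unique_ptr<std::string> LoadRealmFromProfile() {
  g_realm_loads.fetch_add(1);
  return std::unique_ptr<std::string>(new std::string("EXAMPLE.COM"));
}

// Returns a copy of the default realm. The null check and the assignment
// happen under the lock, so only one thread can ever perform the load.
std::string GetDefaultRealmLocked() {
  std::lock_guard<std::mutex> l(g_fake_ctx_lock);
  if (g_fake_ctx.default_realm == nullptr) {
    g_fake_ctx.default_realm = LoadRealmFromProfile();
  }
  return *g_fake_ctx.default_realm;
}

// Starts n threads that all request the realm at once and returns the total
// number of loads performed so far.
int RunConcurrentLookups(int n) {
  assert(n > 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([] { (void)GetDefaultRealmLocked(); });
  }
  for (auto& t : threads) {
    t.join();
  }
  return g_realm_loads.load();
}
```

Without the lock, two threads could both observe a null {{default_realm}} and both write to it, which is the shape of the race reported here. Another option, under the same assumptions, would be to force the lookup once during startup, before any negotiation thread exists, so that the shared context is only read afterwards.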


