Posted to dev@kudu.apache.org by Alexey Serbin <as...@cloudera.com.INVALID> on 2019/06/20 08:08:56 UTC

Re: java unit tests failing when starting MiniKuduCluster

Hi Clemens,

I apologize for dropping the ball on this.

I tried to look into the issue some time ago, but I failed to provision a
Fedora 29 VM, and it slipped away.

Just recently, Todd and I looked at the issue in the context of failing
Kudu tests on Ubuntu 18.04 LTS (bionic).  It turned out the issue is
related to TLSv1.3 handshake changes.  The Kudu code didn't take into account
the specifics of the TLSv1.3 handshake and failed to handle it properly.  The
problem appears as soon as both the client and the server support TLSv1.3
(which is available starting with OpenSSL 1.1.1).
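
As a side note, the size 386073344 from the negotiation.cc log quoted
further down in this thread fits that picture.  Read as four big-endian
bytes it is 17 03 03 00, which looks like the start of a TLS record header
(content type 23 = application_data, record version 0x0303) rather than an
RPC length prefix.  The Python sketch below only does that arithmetic.
Treating the length prefix as a 4-byte big-endian integer is an assumption
about the RPC framing, not something stated in this thread, and the same
bytes would also start a TLSv1.2 application_data record, so this is a hint
and not a proof.

```python
import struct

# Size reported in the negotiation.cc log line quoted later in this thread.
reported_size = 386073344

# Assumption: the RPC layer interprets the first 4 bytes on the wire as a
# big-endian unsigned length. Re-encode the number to recover those bytes.
header = struct.pack(">I", reported_size)

# Interpret the same 4 bytes as the beginning of a TLS record header:
# 1 byte content type, 2 bytes record version, then the high byte of the
# 2-byte record length.
content_type, version_major, version_minor, length_high = header

TLS_CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}

print("raw bytes:     ", header.hex())
print("content type:  ", TLS_CONTENT_TYPES.get(content_type, "unknown"))
print("record version: 0x%02x%02x" % (version_major, version_minor))
```

If that reading is right, the peer was still sending TLS records at a point
where the receiver already expected a plain RPC frame.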

Long story short:
 * the issue is tracked by https://issues.apache.org/jira/browse/KUDU-2871
 * the workaround to use TLS not higher than v1.2 is in the master branch
of the Kudu repo already, see changelist efc3f372e
 * most likely, the workaround to use TLS not higher than v1.2 will be in
the upcoming release Kudu 1.10
 * hopefully, there will soon be a fix to adapt the Kudu RPC code to work
with TLSv1.3
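
To illustrate what "TLS not higher than v1.2" means in practice, here is a
minimal sketch using Python's standard ssl module.  This is not the Kudu
change itself (Kudu's RPC layer is C++ on top of OpenSSL, and I am not
reproducing the changelist here); the helper name is made up for the
example.  It only shows the general idea of capping the negotiated protocol
version on a TLS context.

```python
import ssl


def make_tls12_capped_context() -> ssl.SSLContext:
    """Hypothetical helper: a client context that never negotiates TLSv1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Cap the highest protocol version the context will offer, so the
    # handshake follows the TLSv1.2 message flow even when the underlying
    # OpenSSL library supports TLSv1.3.
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_capped_context()
print(ctx.maximum_version.name)
```

Capping the version on one side is enough for the connection to fall back
to TLSv1.2, since the handshake settles on the highest version both peers
offer.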


Kind regards,

Alexey


On Fri, May 3, 2019 at 12:59 AM Clemens Valiente <
Clemens.Valiente@trivago.com> wrote:

> Hi Alexey,
>
> thanks for looking into it.
>
> I'd like to share the result from tls_handshake-test since that seems to
> be getting closer to the root cause:
>
> [ RUN      ] TestTlsHandshake.TestHandshakeSequence
> /home/cvaliente/git/kudu/src/kudu/security/tls_handshake-test.cc:206:
> Failure
> Value of: client.Continue(buf1, &buf2).IsIncomplete()
>   Actual: false
> Expected: true
>
> it fails at this position:
>
>   // Client receives server Hello and sends client Finished
>   ASSERT_TRUE(client.Continue(buf1, &buf2).IsIncomplete());
>
>
> So for some reason the client does not report the handshake as incomplete
> at this step, i.e. it seems to consider itself done (or failed) already.
>
>
> Cheers
>
> Clemens
>
> ________________________________
> From: Alexey Serbin <as...@cloudera.com.INVALID>
> Sent: 19 April 2019 18:10:51
> To: dev
> Subject: Re: java unit tests failing when starting MiniKuduCluster
>
> Hi Clemens,
>
> Thank you for the information.  Yes, to me it looks like an issue with IO
> over TLS-encrypted channels.  I don't have access to Fedora29 and newer, so
> I didn't do any troubleshooting/debugging this time.  However, if you
> really want to run on Fedora29, you can disable authentication and
> encryption (--rpc_authentication=disabled --rpc_encryption=disabled) and it
> seems that should work fine on Fedora29.  Other options are running on
> Linux platforms that are currently listed at
>
> https://kudu.apache.org/docs/installation.html#_prerequisites_and_requirements
> or going forward with troubleshooting on your own, where good starting
> points might be tls_handshake-test and tls_socket-test in addition to
> negotiation-test.
>
> I'm also planning to take a look at the issue soon.  I'll let you know once
> I find anything of particular interest there.
>
>
> Kind regards,
>
> Alexey
>
> On Thu, Apr 18, 2019 at 2:14 AM Clemens Valiente <
> Clemens.Valiente@trivago.com> wrote:
>
> > Hi Alexey,
> >
> >
> > Please find the test output attached.
> >
> > The pattern that I can see is that the successful tests are those without
> > a signed certificate (server: {pki: NONE) and the ones with disabled
> > encryption on client- or serverside (encryption: DISABLED) - but those
> > last ones are not supposed to establish a successful connection anyway.
> >
> > The failed tests all require some signed key on the server's side and
> > encryption supported by both sides. So it might hint at an SSL issue.
> >
> > I am using a FIPS enabled version of openssl:
> >
> > OpenSSL 1.1.1b FIPS  26 Feb 2019
> >
> >
> > Cheers
> >
> > Clemens
> > ------------------------------
> > *From:* Alexey Serbin <as...@cloudera.com.INVALID>
> > *Sent:* 17 April 2019 20:49:29
> > *To:* dev
> > *Subject:* Re: java unit tests failing when starting MiniKuduCluster
> >
> > Hi Clemens,
> >
> > Could you run the negotiation-test (built without your patch) and attach
> > the output?  The test loops through various combinations of client/server
> > configurations, so it would be much easier to see what works and what
> > does not.  I suspect there might be some issues related to the
> > combination of newer glibc and OpenSSL libraries.
> >
> >
> > Kind regards,
> >
> > Alexey
> >
> > On Wed, Apr 17, 2019 at 3:07 AM Clemens Valiente <
> > Clemens.Valiente@trivago.com> wrote:
> >
> > > Hi Adar,
> > >
> > > thanks for looking into it.
> > >
> > > > I had to apply the patches mentioned in KUDU-2770 to be able to build
> > > > kudu at all. As mentioned in the ticket, I am using fedora 29 which
> > > > comes with glibc 2.28.
> > > >
> > > > I made sure that everything built correctly and ran the c++ tests and
> > > > got many failing tests. They all seem to occur when trying to connect,
> > > > the first stacktrace usually leads to
> > >
> > >
> > > 0417 12:01:53.737010 (+  3212us) negotiation.cc:304] Negotiation
> > complete:
> > > IO error: Server connection negotiation failed: server connection from
> > > 127.4.196.1:38781: received invalid message of size 386073344 which
> > > exceeds the rpc_max_message_size of 52428800 bytes
> > >
> > >
> > > > This message size of 386073344 bytes is the same across all failed
> > > > tests so I suspect that to be part of the issue, but I haven't been
> > > > able to figure out what is inside that message. Is there any debug
> > > > logging I could activate on the tests? I can now easily reproduce a
> > > > failure with
> > >
> > >
> > > bin/rpc_line_item_dao-test --gtest_filter=RpcLineItemDAOTest.TestInsert
> > >
> > >
> > > Thanks a lot for the help
> > >
> > > Clemens
> > >
> > >
> > > ------------------------------
> > > *From:* Adar Lieber-Dembo <ad...@cloudera.com.INVALID>
> > > *Sent:* 16 April 2019 21:15:23
> > > *To:* dev@kudu.apache.org
> > > *Subject:* Re: java unit tests failing when starting MiniKuduCluster
> > >
> > > I'm guessing it's something to do with your local system. You had
> > > previously filed KUDU-2770 regarding incompatibilities with newer
> > > versions of glibc; what system/distro are you running on? Are you
> > > using _any_ Kudu patches whatsoever, even just to get the build (or
> > > thirdparty build) working?
> > >
> > > FWIW, these tests pass locally for me (running against master) and
> > > they're passing in Kudu precommit tests regularly. Could you try to
> > > run the C++ test suite? They're easier to debug when they fail.
> > >
> > > On Tue, Apr 16, 2019 at 12:53 AM Clemens Valiente
> > > <Cl...@trivago.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > >
> > > > > I added these flags after the initial runs that complained about
> > > > > rpc_max_message_size exceeded in the tests. That got me one step
> > > > > further to the connection failed error.
> > > >
> > > >
> > > > > I tried running the tests both on master and on the kudu 1.9.0
> > > > > release (with the appropriate kudu build), with and without my patch.
> > > >
> > > > ________________________________
> > > > From: Adar Lieber-Dembo <ad...@cloudera.com.INVALID>
> > > > Sent: 14 April 2019 19:59:45
> > > > To: dev@kudu.apache.org
> > > > Subject: Re: java unit tests failing when starting MiniKuduCluster
> > > >
> > > > It might be related to your local changes. This caught my eye:
> > > >
> > > >   extra_master_flags: "--rpc_max_message_size=3860733440"
> > > >   extra_tserver_flags: "--rpc_max_message_size=3860733440"
> > > >
> > > > > Why did you have to add this configuration? What happens if you
> > > > > remove it? On a related note, do the tests still fail if you rebuild
> > > > > from a clean working tree (i.e. without your patches)?
> > > >
> > > > On Sat, Apr 13, 2019 at 12:52 PM Clemens Valiente
> > > > <Cl...@trivago.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > I have a few fixes for the kudu-mapreduce package and wanted to add
> > > > > > unit tests but neither the existing nor the newly added unit tests
> > > > > > can be run on my system.
> > > > > >
> > > > > > I managed to build kudu and the MiniKuduCluster starts up but then
> > > > > > fails with connection/timeout issues.
> > > > >
> > > > > I think this might point at the problem:
> > > > >
> > > > > 16:42:05.780 [INFO - cluster stderr printer]
> > > (MiniKuduCluster.java:543) 0410 16:42:05.779084 (+3053866us)
> > > negotiation.cc:304] Negotiation complete: Timed out: Server connection
> > > negotiation failed: server connection from 127.0.0.1:60698
> > > > >
> > > > > > Though I have no explanation as to why kudu would not be able to
> > > > > > make local connections within my machine. (fedora 29, OpenSSL 1.1.1b
> > > > > > FIPS 26 Feb 2019)
> > > > > > I attached the full test report. Can someone help me figure out how
> > > > > > to further debug this, or is there any additional information I can
> > > > > > provide?
> > > > >
> > > > > Best Regards
> > > > > Clemens Valiente
> > >
> >
>