Posted to dev@zookeeper.apache.org by Fangmin Lv <lv...@gmail.com> on 2019/10/28 01:23:35 UTC

String inconsistency issue when running ZK with OpenJDK 10 on SKL machines

Hey everyone,

(I forgot to add a subject to the previous email; resending with a clear
subject.)

I'd like to share some strange inconsistency bugs we recently saw in
production, along with their root cause and possible fixes. It took us
about a month to investigate, reproduce, and find the root cause;
hopefully the information here will help others avoid the same issue.

[Trigger conditions and behavior]

The inconsistency only occurred when running ZK with OpenJDK 10 on SKL
(Skylake) machines. It is not caused by a bug in ZK but by a
macro-assembly bug inside the JDK.

The symptoms we observed include:

* NONODE returned by getData for a child that getChildren still lists,
even though no delete was issued
* NONODE returned when creating a child under a parent node that was just
created successfully, even though no delete was issued
* No client able to acquire the lock even though the session that
previously held it is already dead

[Root cause]

The direct cause of the misbehavior above is that keys/values put into
the ZooKeeperServer.outstandingChangesForPath HashMap or the
DataNode.children HashSet are not visible to later get or remove calls.
As a result, outstanding changes are not visible when the leader prepares
the following txns, or a deleted node is not removed from
DataNode.children.

The 'bad' HashMap/HashSet behavior is not caused by concurrency bugs
inside ZK, but by a bug in the macro-assembly code used to generate the
String.equals intrinsic in JDK 9 and 10. The bug was introduced in
JDK-8144771, which added AVX-512 instruction support to the JDK to
optimize the String.equals intrinsic with 512-bit vector operations.
Because of the bug, String.equals may return a wrong result when the
upper vector registers (xmm16 - xmm31) are used with a non-empty stack
on SKL machines where AVX-512 is available.
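
A minimal probe for the symptom described above, assuming a plain HashSet
of String keys stands in for DataNode.children. The class and method names
are made up for illustration, and this is not the reproducer used in the
investigation. Whether it would actually trigger the bug on an affected
JVM depends on JIT compilation and register allocation, so a clean run
proves little; on a healthy JVM it should report zero failed removals.

```java
import java.util.HashSet;
import java.util.Set;

class StringEqualsProbe {
    // Counts remove() calls that fail to find a key that was just added.
    static int countFailedRemovals(int iterations) {
        Set<String> children = new HashSet<>();
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            String added = "/parent/child-" + i;
            // A distinct but equal String, so the lookup cannot succeed on
            // reference identity and has to go through String.equals.
            String lookup = new String(added.toCharArray());
            children.add(added);
            if (!children.remove(lookup)) {
                failures++;
                children.clear(); // drop the stuck key before the next round
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failed removals: " + countFailedRemovals(1_000_000));
    }
}
```

A non-zero count would correspond to the "node deleted but not removed
from DataNode.children" case: the key is present, but the equality check
during remove does not recognize it.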

The macro-assembly bug we hit is in vptest, which is used in the
string_compare macro-assembly code
<http://hg.openjdk.java.net/jdk/jdk10/file/b09e56145e11/src/hotspot/cpu/x86/macroAssembler_x86.cpp#l4933>.
It uses add/sub instructions when temporarily saving register values to,
and restoring them from, the stack. Those instructions overwrite the ZF
(zero flag) that the preceding test instruction set in the FLAGS register.

In our case, if the key exists in the DataNode.children HashSet, the
result of the test instruction is zero and the ZF bit is set to 1. If the
RSP value is not 0 (i.e. the stack is not empty) after the addptr code,
the ZF bit is then changed to 0, so the String.equals comparison during
removeNode returns a wrong result and the key is not removed.
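
The flag interaction can be illustrated with a toy model in Java. This is
only a sketch of the reasoning above, not HotSpot code: the class, the
method names and the sample stack pointer value are all hypothetical. It
relies on two x86 facts: a test-style instruction sets ZF when its result
is zero, and an add sets ZF from its own result.

```java
class ZeroFlagModel {
    // A test-style instruction sets ZF when its result is zero.
    static boolean zfAfterTest(long testResult) {
        return testResult == 0;
    }

    // An add sets ZF from its own result, overwriting the earlier value.
    static boolean zfAfterAdd(long rsp, long delta) {
        return rsp + delta == 0;
    }

    // ZF as seen by the branch that follows the compare sequence.
    static boolean zfSeenByBranch(long testResult, long rsp, long delta,
                                  boolean stackAdjustedWithAdd) {
        boolean zf = zfAfterTest(testResult);
        if (stackAdjustedWithAdd) {
            zf = zfAfterAdd(rsp, delta);
        }
        return zf;
    }

    public static void main(String[] args) {
        long rsp = 0x7ffc_0000_1000L; // hypothetical non-zero stack pointer
        // Strings are equal (test result 0), flags left alone: prints true
        System.out.println(zfSeenByBranch(0, rsp, 64, false));
        // Same comparison, but an add touches RSP afterwards: prints false
        System.out.println(zfSeenByBranch(0, rsp, 64, true));
    }
}
```

In the second call the comparison itself found the strings equal, but the
branch sees ZF = 0 because the stack pointer is non-zero after the add.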

The bug is reported in JDK-8207746, although the behavior described there
is different; we confirmed the issue by adding assembly code to log it in
JDK 10.

[Solutions]

The possible mitigations are:

1. Disable AVX-512 with the JVM option -XX:UseAVX=2
2. Use an OpenJDK version higher than 10, where the issue has been fixed
by JDK-8207746

Upgrading to OpenJDK 11+ is the better option, since 10 is not well
supported and AVX-512 does help improve performance.
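
A minimal sketch of a startup check for the two mitigations, assuming a
HotSpot JVM that exposes the UseAVX flag through
com.sun.management.HotSpotDiagnosticMXBean. The class and method names are
hypothetical, and the "9 or 10 with UseAVX above 2" rule is only a rough
reading of this report, not an authoritative list of affected builds.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class AvxSettingCheck {
    // Current value of the HotSpot UseAVX flag, or null when it is not
    // available (for example on a non-x86 or non-HotSpot JVM).
    static String useAvxValue() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption("UseAVX").getValue();
        } catch (IllegalArgumentException e) {
            return null; // flag or MXBean not known to this JVM
        }
    }

    // Feature release from java.specification.version: "1.8" -> 8, "10" -> 10.
    static int featureVersion(String specVersion) {
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    // True when the JDK release is one named in this report (9 or 10) and
    // the UseAVX level is above 2, i.e. AVX-512 code generation is enabled.
    static boolean possiblyAffected(int feature, String useAvx) {
        if (useAvx == null) {
            return false; // cannot tell without the flag value
        }
        return (feature == 9 || feature == 10) && Integer.parseInt(useAvx.trim()) > 2;
    }

    public static void main(String[] args) {
        int feature = featureVersion(System.getProperty("java.specification.version"));
        String useAvx = useAvxValue();
        System.out.println("JDK " + feature + ", UseAVX=" + useAvx
                + ", possibly affected: " + possiblyAffected(feature, useAvx));
    }
}
```

Running the same check with -XX:UseAVX=2 on the command line should show
the flag at 2, which is mitigation 1 above.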

We use JDK 10 because of the SSL quorum socket close stall issue mentioned
in ZOOKEEPER-3384 <https://issues.apache.org/jira/browse/ZOOKEEPER-3384>,
and because the SO_LINGER option is not honored in JDK 11. We've unblocked
JDK 11 by closing the quorum socket asynchronously, and we're upstreaming
that in
ZOOKEEPER-3574 <https://issues.apache.org/jira/browse/ZOOKEEPER-3574>.
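
The general idea of an asynchronous close can be sketched as follows. This
is not the ZOOKEEPER-3574 patch; the class name, thread name and the
single-thread executor are assumptions made for illustration only.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncSocketCloser {
    // Single daemon thread, so a stalled close() cannot block JVM shutdown.
    private final ExecutorService closer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "socket-closer");
        t.setDaemon(true);
        return t;
    });

    // Hands the socket to the closer thread and returns immediately.
    CompletableFuture<Void> closeAsync(Socket socket) {
        return CompletableFuture.runAsync(() -> {
            try {
                socket.close();
            } catch (IOException e) {
                // Nothing useful to do; the connection is being dropped anyway.
            }
        }, closer);
    }
}
```

The caller is no longer blocked if close() stalls, at the cost that with a
single closer thread one stalled close delays any closes queued behind it.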

Thanks,
Fangmin

Re: String inconsistency issue when running ZK with OpenJDK 10 on SKL machines

Posted by Patrick Hunt <ph...@apache.org>.
On Mon, Oct 28, 2019 at 12:06 AM Enrico Olivelli <eo...@gmail.com>
wrote:

> Fangmin,
>
> Thank you for sharing this.
>

10x - thanks!

Patrick


> Do you have any pointer to the jdk11 bugs? Is it solved in 12+?
>
> I am running with jdk11-13 but without ssl, so never seen problems.
>
> Enrico

Re: String inconsistency issue when running ZK with OpenJDK 10 on SKL machines

Posted by Fangmin Lv <lv...@gmail.com>.
Enrico,

As Andor mentioned, the issue has been fixed in JDK 11 since b27, you
should be fine :)

Fangmin

On Mon, Oct 28, 2019 at 10:44 PM Andor Molnar <an...@apache.org> wrote:

> Here’s the JDK issue that Fangmin mentioned:
>
> https://bugs.openjdk.java.net/browse/JDK-8207746
>
> It’s a JDK 10 & 11 bug which has already been fixed since JDK11 b27.
>
> Andor
>
>
>
> > On 2019. Oct 28., at 8:00, Enrico Olivelli <eo...@gmail.com> wrote:
> >
> > Fangmin,
> >
> > Thank you for sharing this.
> > Do you have any pointer to the jdk11 bugs? Is it solved in 12+?
> >
> > I am running with jdk11-13 but without ssl, so never seen problems.
> >
> > Enrico

Re: String inconsistency issue when running ZK with OpenJDK 10 on SKL machines

Posted by Andor Molnar <an...@apache.org>.
Here’s the JDK issue that Fangmin mentioned:

https://bugs.openjdk.java.net/browse/JDK-8207746

It’s a JDK 10 & 11 bug which has already been fixed since JDK11 b27.

Andor



> On 2019. Oct 28., at 8:00, Enrico Olivelli <eo...@gmail.com> wrote:
> 
> Fangmin,
> 
> Thank you for sharing this.
> Do you have any pointer to the jdk11 bugs? Is it solved in 12+?
> 
> I am running with jdk11-13 but without ssl, so never seen problems.
> 
> Enrico


Re: String inconsistency issue when running ZK with OpenJDK 10 on SKL machines

Posted by Enrico Olivelli <eo...@gmail.com>.
Fangmin,

Thank you for sharing this.
Do you have any pointer to the jdk11 bugs? Is it solved in 12+?

I am running with jdk11-13 but without ssl, so never seen problems.

Enrico

