Posted to user@curator.apache.org by Erik Nelson <er...@gravity.com> on 2014/01/15 01:42:33 UTC

protection on ephemeral nodes can go haywire

I've noticed that it is quite common for my list of nodes to contain a bunch of entries that look like:

/_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-[actual node name]

sometimes even with the protection guid repeated again with the same actual node name. 

This can get to the point where I start getting exceptions on the client complaining that the node name needs to begin with '/'; I believe that the name exceeds the maximum size and is being truncated.
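
For anyone triaging the same symptom, here is a minimal Java sketch (plain JDK, not Curator code) that counts how many protection prefixes are stacked on a node name. It assumes each prefix has the form seen in the listing above: "_c_", then a GUID, then a trailing "-". The class name ProtectedNames and the node name "worker" are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProtectedNames {
    // One protection prefix as it appears in the listing: "_c_" + GUID + "-".
    // Assumption: the GUID is a standard 8-4-4-4-12 hex string.
    private static final Pattern PREFIX = Pattern.compile(
        "_c_[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}-");

    /** Counts the protection prefixes stacked at the start of the last path segment. */
    static int countPrefixes(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        Matcher m = PREFIX.matcher(name);
        int count = 0;
        int pos = 0;
        // lookingAt() only matches at the start of the current region,
        // so the loop stops at the first character that is not a prefix.
        while (m.region(pos, name.length()).lookingAt()) {
            count++;
            pos = m.end();
        }
        return count;
    }

    public static void main(String[] args) {
        String guid = "_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-";
        // "worker" is a hypothetical node name standing in for [actual node name].
        System.out.println(countPrefixes("/" + guid + guid + guid + "worker"));
    }
}
```

A healthy protected node should report a count of 1; anything higher matches the runaway names described here.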

Is this a known issue? My use of PersistentEphemeralNode isn't too fancy, and this happens without any badness in the quorum. I've looked through the Jira issues but can't find anything related.

Thanks,

erik

Re: protection on ephemeral nodes can go haywire

Posted by Erik Nelson <er...@gravity.com>.
That is similar, but it is caused by sequential mode (appending a number) rather than protected mode (prepending a GUID).
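
To make the distinction concrete, here is a rough Java sketch (plain JDK, not Curator code) that tells the two decorations apart. It assumes the protection prefix looks like the ones in the original listing ("_c_" + GUID + "-") and that ZooKeeper's sequential suffix is a 10-digit zero-padded counter. The class name NodeNameKind and the sample names are hypothetical.

```java
import java.util.regex.Pattern;

final class NodeNameKind {
    // Protected mode: a "_c_<GUID>-" prefix is prepended to the node name.
    private static final Pattern PROTECTED_PREFIX = Pattern.compile(
        "^_c_[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}-");

    // Sequential mode: a 10-digit zero-padded counter is appended by the server
    // (assumption: the counter has not overflowed into a negative value).
    private static final Pattern SEQUENTIAL_SUFFIX = Pattern.compile("[0-9]{10}$");

    static boolean isProtected(String nodeName) {
        return PROTECTED_PREFIX.matcher(nodeName).find();
    }

    static boolean isSequential(String nodeName) {
        return SEQUENTIAL_SUFFIX.matcher(nodeName).find();
    }
}
```

A name such as "lock-0000000012" would be flagged as sequential only, while "_c_35528b0e-9e81-4bc2-8f3d-4e64198dc5cc-lock" would be flagged as protected only.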

We have a lot of issues with connections to ZooKeeper quorums flapping, and I suspect this happens when connections get reset.

I just cleared out a large number of servers that were in the "node name too long and client gets an error on trying to create new node" state, so that the ZooKeeper server-side log has a vaguely legible signal-to-noise ratio and I can figure out how the ever-expanding protected node names start. I certainly haven't ever been able to reproduce this in development!

On Jan 14, 2014, at 9:17 PM, Adarsh Bhat <ad...@gmail.com> wrote:

CURATOR-56 sounds similar, but in a different recipe.


Re: protection on ephemeral nodes can go haywire

Posted by Adarsh Bhat <ad...@gmail.com>.
CURATOR-56 sounds similar, but in a different recipe.

On Tuesday, January 14, 2014, Erik Nelson wrote:

>  Okay. The problem is, of course, that this happens in [heavily loaded]
> production. I'll see if I can replicate in dev to get that test case.

Re: protection on ephemeral nodes can go haywire

Posted by Erik Nelson <er...@gravity.com>.
Okay. The problem is, of course, that this happens in [heavily loaded] production. I'll see if I can replicate in dev to get that test case.

On Jan 14, 2014, at 4:52 PM, Jordan Zimmerman <jo...@jordanzimmerman.com> wrote:

OK - please add an issue on Curator’s Jira and a test case.

-JZ


Re: protection on ephemeral nodes can go haywire

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
OK - please add an issue on Curator’s Jira and a test case.

-JZ

From: Erik Nelson
Reply: Erik Nelson erik@gravity.com
Date: January 14, 2014 at 4:52:21 PM
To: Jordan Zimmerman jordan@jordanzimmerman.com
Subject:  Re: protection on ephemeral nodes can go haywire  
This is curator 2.3 and zookeeper 3.4.5-cdh4.3.2


Re: protection on ephemeral nodes can go haywire

Posted by Erik Nelson <er...@gravity.com>.
This is curator 2.3 and zookeeper 3.4.5-cdh4.3.2

On Jan 14, 2014, at 4:44 PM, Jordan Zimmerman <jo...@jordanzimmerman.com> wrote:

That looks like a bug to me. The GUID should only be present once. I remember there being a bug like this a long time ago. What version are you using?

-JZ


Re: protection on ephemeral nodes can go haywire

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
That looks like a bug to me. The GUID should only be present once. I remember there being a bug like this a long time ago. What version are you using?

-JZ
