Posted to dev@tomcat.apache.org by bu...@apache.org on 2023/06/20 20:41:03 UTC

[Bug 66660] New: StaticMember doesn't support lazy hostname resolution (useful in K8s)

https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

            Bug ID: 66660
           Summary: StaticMember doesn't support lazy hostname resolution
                    (useful in K8s)
           Product: Tomcat 9
           Version: 9.0.x
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Cluster
          Assignee: dev@tomcat.apache.org
          Reporter: diego.rivera@armedia.com
  Target Milestone: -----

In many (most) vendor-provided K8s environments, UDP is not allowed to be used
within clusters. Thus, one is forced to use static TCP member lists pointing to
all the pods that will be members of the cluster:

<!-- Cluster with 3 members: pod-0, pod-1, and pod-2 -->
<LocalMember .../> <!-- this is pod-0 -->
<Member host="pod-1" ... />
<Member host="pod-2" ... />

If, for whatever reason, there's a need to boot up the pods in OrderedReady (as
opposed to Parallel), then pod-0 will be booted before the others, and the
StaticMember implementations will fail to resolve the hosts "pod-1" and
"pod-2", because they will not exist in K8s DNS yet (b/c the pods won't exist
until the first pod is in the Ready state).

Thus, the first pod will be left with a broken member list: later on the
members will be there, but since they weren't around during initialization, the
member list is left empty.

Instead of caching the hostname once on construction, the IP address
(getByName() result) should only be cached if the member is confirmed to be
healthy. If the member is initializing, then it's OK to resolve the hostname
every time (perhaps add an attribute to limit the number of hostname resolution
retries?). If the member is down and being polled for health, it's OK to resolve
the hostname every time as well.

One possible solution is to encapsulate MemberImpl.host in a method that does
the lookup and caching/de-caching based on the member's state.
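
To make the idea concrete, here is a minimal Java sketch of a host accessor
that resolves on demand and only caches while the member is confirmed healthy.
This is not Tomcat's MemberImpl: the class name LazyHost, the Resolver
interface and the markHealthy()/markFailed() hooks are all hypothetical, and
returning null for a name that does not resolve yet is just one assumed way of
representing a member that is expected but not available.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper (not part of Tomcat): resolves a member's hostname lazily. */
final class LazyHost {

    /** Lookup strategy; injectable so the sketch can be exercised without real DNS. */
    interface Resolver {
        byte[] resolve(String hostname) throws UnknownHostException;
    }

    private final String hostname;
    private final Resolver resolver;
    private byte[] cached; // non-null only while the member is confirmed healthy

    LazyHost(String hostname, Resolver resolver) {
        this.hostname = hostname; // no lookup at construction time
        this.resolver = resolver;
    }

    LazyHost(String hostname) {
        this(hostname, name -> InetAddress.getByName(name).getAddress());
    }

    /** Returns the address bytes, or null while the hostname does not resolve yet. */
    synchronized byte[] getHost() {
        return cached != null ? cached.clone() : lookup();
    }

    /** Member confirmed healthy: pin whatever the hostname resolves to right now. */
    synchronized boolean markHealthy() {
        cached = lookup();
        return cached != null;
    }

    /** Member went away: drop the pinned address so the next call resolves again. */
    synchronized void markFailed() {
        cached = null;
    }

    private byte[] lookup() {
        try {
            return resolver.resolve(hostname);
        } catch (UnknownHostException e) {
            return null; // expected member that is not in DNS (yet)
        }
    }

    public static void main(String[] args) {
        LazyHost host = new LazyHost("127.0.0.1"); // IP literal, so no DNS query
        System.out.println(host.getHost().length + " address bytes");
    }
}
```

A retry limit, as suggested above, could be layered on top by counting failed
lookups inside lookup(); that part is left out to keep the sketch short.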

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

Diego Rivera <di...@armedia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |REOPENED

--- Comment #14 from Diego Rivera <di...@armedia.com> ---
(In reply to jfclere from comment #13)
> The DNS service and the logic in tomcat cluster expects the pods to be ready
> otherwise the whole stuff is useless.
> The lookup is done on the service which returns all the available pods
> Probably your problems are related to the readiness probe you are using.

This is an incorrect assumption. The service only returns its own IP. This is
by design. At least, that's how it's worked on every K8s instance I've worked
with (vanilla K8s and EKS).

Also, this specific application doesn't support concurrent startup, so *it's
impossible for all pods to exist at the same time*. Hence the need to add the
"concept" (if you will) of an *expected member* that may not be available when
the first pod comes up (b/c the other pods may not even exist yet).

Once the first pod becomes ready, the 2nd pod will be created, and eventually
come online. Then the 3rd, 4th, etc...

Therefore it's OK for the cluster members to all know that EVENTUALLY a certain
set of pods will be part of the cluster. The problem is that this is not
possible to achieve without access to the K8s API, and I've already explained
why this is not a desirable option.

This isn't something that can be addressed with the existing code (I've
checked), so I'm going to take the (probably unwelcome) step of reopening the
ticket.

So, again, the DNS Provider won't work b/c the service won't return all pod
IPs (it can't, by design, due to DNS caching!), and the K8s provider requires
access to the K8s API which in some of our deployments won't be allowed.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #21 from Diego Rivera <di...@armedia.com> ---
(In reply to romain.manni-bucau from comment #16)

So... I tried to configure the K8s provider as per the docs, and other
resources I found (https://github.com/devlinx9/k8s_tomcat_custer).  Per those
resources, this configuration should be both correct and sufficient:

      <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
               channelStartOptions="3"
               channelSendOptions="8">
        <Channel className="org.apache.catalina.tribes.group.GroupChannel">
          <Membership
className="org.apache.catalina.tribes.membership.cloud.CloudMembershipService"/>
        </Channel>
      </Cluster>

I also set the KUBERNETES_LABELS envvar correctly (tested manually using curl,
compared vs. what the code does), and set the KUBERNETES_NAMESPACE variable as
well.

KUBERNETES_LABELS:
"app.kubernetes.io/instance=myapp,app.kubernetes.io/name=mypod"
KUBERNETES_NAMESPACE: "default"

Basically, I double-checked at every step that the code *should* do what is
expected. Testing the direct-API access (using curl) from within the pod(s)
works just fine, so the access control is also configured correctly. I even
used the same token file as the code does by default
(/var/run/secrets/kubernetes.io/serviceaccount/token), as well as the CA
(/var/run/secrets/kubernetes.io/serviceaccount/ca.crt).

The curl version of the fetchMembers() query worked just fine.

However, I don't see any messages flowing regarding cluster members being
added/removed. Specifically, I enabled JMX and am looking at the Cluster MBeans
(Both Catalina/Cluster and ClusterChannel), and there are no members to be
found. 

The hasMembers() method even returns false!!

So ... help?

The mailing list archives only show 5 hits for KubernetesMembershipProvider,
and only one other thread for CloudMembershipService ...

Cheers...



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #16 from romain.manni-bucau <rm...@gmail.com> ---
Sounds like what you want is to start the cluster only after all pods are up,
which means you assume that none of the N-1 pods serve anything before pod N is
started. I think this is a big assumption for Tomcat.

If not, the DNS service sounds like a reasonable compromise if you don't want
to watch the Kubernetes API, so as not to have to set up a service account.

The last option is to implement a custom membership strategy. Given the
specificity of your deployment, I think it is not such a bad option, and it
could even make sense to use Tomcat cluster events so that instance X notifies
instances [1;X-1] that it has started and joins the cluster (a bit like the
multicast implementation, but without needing multicast). But I really think
this is a very particular deployment mode, and to be honest I'm not sure it is
sane to propagate or encourage it.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #2 from Mark Thomas <ma...@apache.org> ---
It looks like it needs more Javadoc and some documentation but I suspect using
the k8s specific membership provider would be a better solution:

https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/membership/cloud/package-summary.html



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #4 from Diego Rivera <di...@armedia.com> ---
(In reply to Mark Thomas from comment #2)
> It looks like it needs more Javadoc and some documentation but I suspect
> using the k8s specific membership provider would be a better solution:
> 
> https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/
> membership/cloud/package-summary.html

We're very much trying to avoid granting access to the K8s infrastructure to
our pods, for security reasons.

I saw that class and it piqued my interest, but quickly fizzled when I saw how
it works. We have the same conundrum: when it fetches the membership list, only
one pod exists (the one being booted up), so we'd be back to square one
regarding "expected members which don't yet exist or aren't yet available".

The problem is that the StaticMember expects all members to exist upon
instantiation, whereas this may not always be the case.



Re: [Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by Christopher Schultz <ch...@christopherschultz.net>.
All,

On 6/21/23 13:14, bugzilla@apache.org wrote:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=66660
> 
> --- Comment #20 from Diego Rivera <di...@armedia.com> ---
> (In reply to romain.manni-bucau from comment #16)
>> Sounds like what you want is to start the cluster only after all pods are
>> up, which means you assume that none of the N-1 pods serve anything before
>> pod N is started. I think this is a big assumption for Tomcat.
>>
>> If not, the DNS service sounds like a reasonable compromise if you don't
>> want to watch the Kubernetes API, so as not to have to set up a service
>> account.
>>
>> The last option is to implement a custom membership strategy. Given the
>> specificity of your deployment, I think it is not such a bad option, and it
>> could even make sense to use Tomcat cluster events so that instance X
>> notifies instances [1;X-1] that it has started and joins the cluster (a bit
>> like the multicast implementation, but without needing multicast). But I
>> really think this is a very particular deployment mode, and to be honest I'm
>> not sure it is sane to propagate or encourage it.
> 
> Well, what I want is to start the cluster with not all nodes up, and enable
> late-starting nodes to join the cluster successfully. This isn't too
> far-fetched.
> 
> In the olden days of servers and VMs this wasn't an issue b/c the server/VM
> hostnames could be expected to be in DNS already.
> 
> This no longer applies in the era of ephemeral containers, where they may come
> up and down at will and without warning (crash or planned outage, regardless).
> 
> Let's assume that I was actually able to start my pods in parallel (I'll grant
> you this is an unexpected shortcoming of the application in question, but I
> digress), and the DNS names are resolved successfully.
> 
> This is possible, works just fine, and I've done it ... before I discovered the
> "must start serially" BS I'm having to find a workaround for.  However, here's
> the catch: if one of the members listed (by hostname, of course) dies, and has
> to be replaced, that new member is now unable to rejoin the cluster since the
> other nodes in the cluster won't accept it as a member, b/c it's not in the
> list of static members, b/c its IP has now changed.
> 
> I'm currently fiddling with the MemberImpl code, and I may have a compromise
> that works fairly well. I'll submit a preliminary patch to this thread in the
> next few hours (without tests, mind you ... just for feedback ... once the
> approach is vetted I'll make sure to dot all the t's and cross all the i's).

I don't want to pollute this BZ with this comment because it may be way
off-base. I'm no Tomcat-cluster expert, nor am I more than passingly
familiar with k8s.

It seems that the problem is trying to define a dynamic cluster with 
static membership. Wouldn't switching to dynamic membership solve 
everything?

-chris



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #20 from Diego Rivera <di...@armedia.com> ---
(In reply to romain.manni-bucau from comment #16)
> Sounds like what you want is to start the cluster only after all pods are up,
> which means you assume that none of the N-1 pods serve anything before pod N
> is started. I think this is a big assumption for Tomcat.
> 
> If not, the DNS service sounds like a reasonable compromise if you don't want
> to watch the Kubernetes API, so as not to have to set up a service account.
> 
> The last option is to implement a custom membership strategy. Given the
> specificity of your deployment, I think it is not such a bad option, and it
> could even make sense to use Tomcat cluster events so that instance X
> notifies instances [1;X-1] that it has started and joins the cluster (a bit
> like the multicast implementation, but without needing multicast). But I
> really think this is a very particular deployment mode, and to be honest I'm
> not sure it is sane to propagate or encourage it.

Well, what I want is to start the cluster with not all nodes up, and enable
late-starting nodes to join the cluster successfully. This isn't too
far-fetched.

In the olden days of servers and VMs this wasn't an issue b/c the server/VM
hostnames could be expected to be in DNS already.

This no longer applies in the era of ephemeral containers, where they may come
up and down at will and without warning (crash or planned outage, regardless).

Let's assume that I was actually able to start my pods in parallel (I'll grant
you this is an unexpected shortcoming of the application in question, but I
digress), and the DNS names are resolved successfully.

This is possible, works just fine, and I've done it ... before I discovered the
"must start serially" BS I'm having to find a workaround for.  However, here's
the catch: if one of the members listed (by hostname, of course) dies, and has
to be replaced, that new member is now unable to rejoin the cluster since the
other nodes in the cluster won't accept it as a member, b/c it's not in the
list of static members, b/c its IP has now changed.

I'm currently fiddling with the MemberImpl code, and I may have a compromise
that works fairly well. I'll submit a preliminary patch to this thread in the
next few hours (without tests, mind you ... just for feedback ... once the
approach is vetted I'll make sure to dot all the t's and cross all the i's).

Cheers!



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #7 from Diego Rivera <di...@armedia.com> ---

> Sure, but you're asking for a dynamic static membership list here, basically.


Kind of, but not. The only thing "dynamic" is the IP address to which each
hostname would resolve. If the hostname string is an IP address, well then
so be it - there's no updating it and its end result will always be the same.

What I'm asking for is a static membership list, with static hostname
*strings*, where the resulting IP address said strings resolve to is allowed to
change during the life of the Member.

Also, I'm not saying that the name-to-address mapping should be allowed to
change at any arbitrary moment: it should only be set once the member is
confirmed to exist, and should only be cleared if the member disappears. The
hostname string wouldn't change - only the IP address.

The special edge case where the hostname string is an actual IP address really
doesn't need any special handling as the resulting IP bytes won't ever change,
and the overhead of computing them is negligible in the grand scheme of things.

So yes - a little bit of dynamism, but not enough to require a significant
rewrite of the Static membership classes (at least, I hope not :D).



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #3 from Diego Rivera <di...@armedia.com> ---
(In reply to romain.manni-bucau from comment #1)
> Hi, it's probably not mainstream yet, but did you evaluate using a custom DNS
> resolver implementation that delegates to the JVM when the host values are
> not known to be the pod ones? (https://bugs.openjdk.org/browse/JDK-8263693).
> I think it makes this usage quite straightforward: it just requires adding
> the custom implementation to the Tomcat launch classpath (with bootstrap.jar,
> or directly in the JVM classpath if embedded), and that's it. It can avoid
> putting a specific strategy into Tomcat, and lets you test with the more
> appropriate one before potentially contributing it back. WDYT?

That wouldn't help.

The problem is *when* the DNS resolution happens, not "by whom". The MemberImpl
code does the resolution during construction and expects to store the IP
address data - whatever it is - at that point.

If resolution fails, construction fails and thus the member isn't added to the
cluster - i.e. there's no provision for an "expected member": the member's
hostname is required to exist at the moment the object is instantiated.
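
To illustrate the failure mode, here is a small Java sketch of a member that
resolves its hostname in the constructor. EagerMember and canConstruct() are
hypothetical names that only mimic the behaviour described above; this is not
the actual MemberImpl code. The ".invalid" name stands in for a pod that is
not in DNS yet.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical stand-in for a member that resolves its hostname at construction. */
final class EagerMember {
    private final byte[] host;

    EagerMember(String hostname) throws UnknownHostException {
        // Resolution happens here, exactly once; an unknown name aborts construction.
        this.host = InetAddress.getByName(hostname).getAddress();
    }

    byte[] getHost() {
        return host.clone();
    }

    /** True if a member for this hostname could be built right now. */
    static boolean canConstruct(String hostname) {
        try {
            new EagerMember(hostname);
            return true;
        } catch (UnknownHostException e) {
            return false; // no object exists, so nothing can retry the lookup later
        }
    }

    public static void main(String[] args) {
        System.out.println(canConstruct("127.0.0.1"));
        // ".invalid" is a reserved TLD, used here in place of a pod that is not up yet
        System.out.println(canConstruct("pod-1.invalid"));
    }
}
```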

Naturally we can't add an IP address b/c no such address exists yet (or else
we'd already have a hostname).



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #8 from romain.manni-bucau <rm...@gmail.com> ---
Front your pods with services, or just implement a custom discovery. The static
one looks like the wrong fit for you, and not being willing to use the K8s API
will require another registration mechanism anyway.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #10 from Remy Maucherat <re...@apache.org> ---
(In reply to Diego Rivera from comment #9)
> I already covered the downside of attempting to use the K8s API: the member
> list is compiled up front during membership startup as well, and is never
> updated afterwards at runtime, so I'd end up with an incomplete/inaccurate
> member list anyway (because, remember, the member pods may not be up or even
> exist yet).

Both the DNS and K8s API membership providers retrieve the list of currently
active pods and work with that, so it is dynamic. The DNS one is a bit less
reliable, but it usually works ...



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

Diego Rivera <di...@armedia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #22 from Diego Rivera <di...@armedia.com> ---
(In reply to Diego Rivera from comment #21)
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelStartOptions="3"
>                channelSendOptions="8">

Apparently those options attributes break the K8s stuff ... I removed them and
everything started ticking along quite happily. So at least there's that
avenue.

I also found this:
https://redisson.org/articles/redis-based-tomcat-session-management.html#:~:text=Redisson's%20Tomcat%20Session%20Manager%20allows,might%20serialize%20the%20whole%20session.

Using Redis as the session propagator may be a solid option instead, since I
can have a single Redis instance for all the tomcat components (yes, I have
different apps in the ecosystem which each require their own tomcat ... FML :(
).

But that's beyond the scope of this thread.

Also, I had a patch almost worked out until I realized that it would break
other stuff b/c of how the Member interface was conceived (i.e. with the
thought of a member's IP never changing).

For now, we can return to a RESOLVED status.

Sorry for the extra fuss ...



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #6 from Remy Maucherat <re...@apache.org> ---
(In reply to Diego Rivera from comment #4)
> (In reply to Mark Thomas from comment #2)
> > It looks like it needs more Javadoc and some documentation but I suspect
> > using the k8s specific membership provider would be a better solution:
> > 
> > https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/
> > membership/cloud/package-summary.html
> 
> We're very much trying to avoid granting access to the K8s infrastructure to
> our pods, for security reasons.

I don't really understand the security concern since this is so granular, and
that seems to be the whole point. But anyway, there is also the K8s DNS-based
membership provider:
https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/membership/cloud/DNSMembershipProvider.html
I personally prefer the regular provider since it is more reliable, though.

> I saw that class and it piqued my interest, but quickly fizzled when I saw
> how it works. We have the same conundrum: when it fetches the membership
> list, only one pod exists (the one being booted up), so we'd be back to
> square one regarding "expected members which don't yet exist or aren't yet
> available".
> 
> The problem is that the StaticMember expects all members to exist upon
> instantiation, whereas this may not always be the case.

Sure, but you're asking for a dynamic static membership list here, basically.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

Remy Maucherat <re...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|NEW                         |RESOLVED

--- Comment #11 from Remy Maucherat <re...@apache.org> ---
The static member list seems to work as designed as far as I am concerned
(servers don't exist, DNS names don't resolve). Given all the explanations
provided, this seems more appropriate as a discussion on the users list at the
moment.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #23 from Remy Maucherat <re...@apache.org> ---
(In reply to Diego Rivera from comment #22)
> (In reply to Diego Rivera from comment #21)
> >       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
> >                channelStartOptions="3"
> >                channelSendOptions="8">
> 
> Apparently those options attributes break the K8s stuff ... I removed them
> and everything started ticking along quite happily. So at least there's that
> avenue.

I had a look to verify there was no problem, and channelStartOptions allows
doing the startup in multiple controlled steps. Using "3", the membership
service is not started, so the membership provider is not either. Some other
component has to do the other start steps explicitly.
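
As a rough illustration of why "3" leaves membership out, the start options
can be read as a bit mask. The sketch below uses local copies of the flags;
their names and values are assumed to mirror the constants on
org.apache.catalina.tribes.Channel (1, 2, 4, 8, with 15 as the default), so
check them against your Tomcat version. The StartOptions class and its
methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Local copies of the channel start flags. Names and values are assumed to
 * match org.apache.catalina.tribes.Channel; this class is not part of Tomcat.
 */
final class StartOptions {
    static final int SND_RX_SEQ = 1; // message receiver
    static final int SND_TX_SEQ = 2; // message sender
    static final int MBR_RX_SEQ = 4; // membership listener
    static final int MBR_TX_SEQ = 8; // membership broadcaster
    static final int DEFAULT = SND_RX_SEQ | SND_TX_SEQ | MBR_RX_SEQ | MBR_TX_SEQ; // 15

    /** Names of the services selected by a channelStartOptions value. */
    static List<String> started(int options) {
        List<String> names = new ArrayList<>();
        if ((options & SND_RX_SEQ) != 0) names.add("SND_RX_SEQ");
        if ((options & SND_TX_SEQ) != 0) names.add("SND_TX_SEQ");
        if ((options & MBR_RX_SEQ) != 0) names.add("MBR_RX_SEQ");
        if ((options & MBR_TX_SEQ) != 0) names.add("MBR_TX_SEQ");
        return names;
    }

    /** True if either membership bit is set. */
    static boolean membershipStarted(int options) {
        return (options & (MBR_RX_SEQ | MBR_TX_SEQ)) != 0;
    }

    public static void main(String[] args) {
        System.out.println("3  -> " + started(3) + ", membership=" + membershipStarted(3));
        System.out.println("15 -> " + started(DEFAULT) + ", membership=" + membershipStarted(DEFAULT));
    }
}
```

Under those assumed values, 3 selects only the two messaging services, which is
consistent with the membership provider never being started.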



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #9 from Diego Rivera <di...@armedia.com> ---
(In reply to romain.manni-bucau from comment #8)
> Front your pods with services, or just implement a custom discovery. The
> static one looks like the wrong fit for you, and not being willing to use the
> K8s API will require another registration mechanism anyway.

Fronting the pods with a service would be useless since I'd have no control over
which pod the service would route the member-query connection to (could be
itself, could be another pod ... in this case, it would be itself since it'd be
the only pod up, which would be useless!).

I already covered the downside of attempting to use the K8s API: the member
list is compiled up front during membership startup as well, and is never
updated afterwards at runtime, so I'd end up with an incomplete/inaccurate
member list anyway (because, remember, the member pods may not be up or even
exist yet).

I could definitely implement a custom discovery scheme, but since making that
adjustment to the static members would be enough, I thought it might be a solid
alternative.

Hell, we can even make the new behavior optional - i.e. add an attribute called
"dynamicResolution" with a boolean value to enable/disable the new behavior ...

Cheers.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #1 from romain.manni-bucau <rm...@gmail.com> ---
Hi, it's probably not mainstream yet, but did you evaluate using a custom DNS
resolver implementation that delegates to the JVM when the host values are not
known to be the pod ones? (https://bugs.openjdk.org/browse/JDK-8263693). I think
it makes this usage quite straightforward: it just requires adding the custom
implementation to the Tomcat launch classpath (with bootstrap.jar, or directly
in the JVM classpath if embedded), and that's it. It can avoid putting a
specific strategy into Tomcat, and lets you test with the more appropriate one
before potentially contributing it back. WDYT?



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #12 from Diego Rivera <di...@armedia.com> ---
(In reply to Remy Maucherat from comment #10)
> (In reply to Diego Rivera from comment #9)
> > I already covered the downside of attempting to use the K8s API: the member
> > list is compiled up front during membership startup as well, and is never
> > updated afterwards at runtime, so I'd end up with an incomplete/inaccurate
> > member list anyway (because, remember, the member pods may not be up or even
> > exist yet).
> 
> Both the DNS and K8s API membership providers retrieve the list of currently
> active pods and work with that, so it is dynamic. The DNS one is a bit less
> reliable, but it usually works ...

Well, the DNSMembershipProvider implementation depends on the service's DNS
entry returning all member pod IPs on query, which in some cases it does not
since in K8s each Service gets its own IP, which is then routed over to the
pods' IPs. This is by design to avoid DNS caches routing requests to pods which
have disappeared when you really only should be accessing them via the Service.

Thus, it wouldn't solve our predicament b/c the cluster would be made up of a
single member: the service's IP address.
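
Whether a given name returns one address or several is easy to check from
inside a pod. The sketch below is a plain JDK lookup using
InetAddress.getAllByName(); ServiceLookup and resolveAll() are hypothetical
names, and the service name you pass on the command line is your own. It says
nothing about how DNSMembershipProvider itself performs its query.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic: lists every address a DNS name currently resolves to. */
final class ServiceLookup {

    /** Returns all resolved addresses as strings, or an empty list if none. */
    static List<String> resolveAll(String name) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(name)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Name not in DNS: report it as "no members" rather than failing.
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Pass the service's DNS name as the first argument when run inside a pod.
        String name = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(name + " -> " + resolveAll(name));
    }
}
```

If the list only ever contains one address for the service name, a provider
that builds its member list from that lookup would see a single member.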

The KubernetesMembershipProvider requires granting access to the K8s API, which
may be out of the question in some deployments due to security concerns. The
ideal solution to this scenario would be oblivious to whether it's being
deployed in K8s, Docker Swarm, or any other orchestration/clustering
infrastructure.

After reading the code a little more calmly, these are the changes I would
suggest which would solve the issue while having near-0 impact on the existing
code's functionality (all in MemberImpl.java):

* Delay the call to getByName() until getHost() is called
* Remove setHost() altogether (not really needed anymore)
* Change the MemberImpl.host field to transient
* Change the MemberImpl.hostname field to non-transient
* Update the static MemberImpl.getMember() method to use setHostname() instead
of setHost(), since the host's IP would no longer be serialized (nor would we
want it to be, b/c we want to resolve the hostname each time it's required).
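
A minimal sketch of the changes listed above, under stated assumptions:
LazyHostMember is a hypothetical stand-in for MemberImpl, not the Tomcat class,
and it uses plain Java serialization for brevity rather than whatever wire
format MemberImpl actually uses. It only shows the shape of the idea: the
hostname is stored and serialized, and getByName() runs inside getHost().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-in for MemberImpl, written only to illustrate the idea.
class LazyHostMember implements Serializable {
    private static final long serialVersionUID = 1L;

    // Non-transient: the hostname is what gets serialized.
    private final String hostname;

    // Transient: holds the most recent lookup result and is never serialized.
    private transient byte[] host;

    LazyHostMember(String hostname) {
        // No getByName() call here, so an unresolvable name cannot fail construction.
        this.hostname = hostname;
    }

    String getHostname() {
        return hostname;
    }

    // Resolves the hostname on every call instead of once at construction.
    byte[] getHost() throws UnknownHostException {
        host = InetAddress.getByName(hostname).getAddress();
        return host;
    }

    // Serializes and deserializes a member to show that only the hostname travels.
    static LazyHostMember roundTrip(LazyHostMember member)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(member);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LazyHostMember) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LazyHostMember member = new LazyHostMember("127.0.0.1");
        LazyHostMember copy = roundTrip(member);
        System.out.println(copy.getHostname() + " -> "
                + InetAddress.getByAddress(copy.getHost()).getHostAddress());
    }
}
```

With this shape, a member for "pod-1" can be constructed before the name exists
in DNS; the UnknownHostException only surfaces when getHost() is called, and a
later call can succeed once the pod is up.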

From where I'm sitting, the impact of the above changes would be twofold:
serialized Member instances would be slightly larger, since they'd now carry
the hostname string rather than the IP address's byte[], and there would be a
small DNS lookup cost from resolving the hostname whenever getHost() is called.

If this impact is deemed unacceptable, I'm sure the code could be made smart
enough to cache the value until the member is marked as failed (not sure how
that would be done from MemberImpl's perspective, but I'm sure something can be
worked out).
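
One possible shape for the "cache until the member is marked as failed" idea,
as a sketch only. CachedHostResolver, its Lookup interface, and markFailed()
are hypothetical names, not Tomcat API; how a failure would actually be
signalled to MemberImpl is left open here, exactly as it is above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Tomcat.
class CachedHostResolver {
    // Lookup strategy; defaults to InetAddress.getByName but can be swapped in tests.
    interface Lookup {
        InetAddress resolve(String hostname) throws UnknownHostException;
    }

    private final String hostname;
    private final Lookup lookup;
    private InetAddress cached; // null until a lookup succeeds

    CachedHostResolver(String hostname) {
        this(hostname, InetAddress::getByName);
    }

    CachedHostResolver(String hostname, Lookup lookup) {
        this.hostname = hostname;
        this.lookup = lookup;
    }

    // Returns the cached address, resolving first if nothing is cached.
    // A failed lookup caches nothing, so the next call tries again.
    synchronized InetAddress getAddress() throws UnknownHostException {
        if (cached == null) {
            cached = lookup.resolve(hostname);
        }
        return cached;
    }

    // Called when the member is considered failed; forces a fresh lookup next time.
    synchronized void markFailed() {
        cached = null;
    }
}
```

Because a failed lookup leaves the cache empty, a member whose name is not yet
in DNS is simply retried on the next getAddress() call, while a healthy member
pays the lookup cost only once until it is marked as failed.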

Then again, I'd like an expert's opinion on whether the above changes would
have any unforeseen impact beyond what I've described. If they don't, I'll be
happy to spend some time on this and submit a patch proposal.

Finally, this new functionality could be turned on/off with a flag (e.g.
delayedLookup=true?) if there's enough aversion to the change becoming the
default (which I can definitely understand).
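
A rough illustration of how such a flag could gate the behaviour. FlaggedMember
is a hypothetical class, and "delayedLookup" is only the name floated above,
not an existing Tomcat attribute. With the flag off, the name is resolved once
at construction, as today; with it on, resolution moves into getHost().

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical class; "delayedLookup" is a proposed name, not an existing attribute.
class FlaggedMember {
    private final String hostname;
    private final boolean delayedLookup;
    private byte[] host; // only populated when delayedLookup is false

    FlaggedMember(String hostname, boolean delayedLookup) throws UnknownHostException {
        this.hostname = hostname;
        this.delayedLookup = delayedLookup;
        if (!delayedLookup) {
            // Eager mode: resolve once; an unknown name fails construction.
            this.host = InetAddress.getByName(hostname).getAddress();
        }
    }

    byte[] getHost() throws UnknownHostException {
        if (delayedLookup) {
            // Delayed mode: resolve at the point of use, on every call.
            return InetAddress.getByName(hostname).getAddress();
        }
        return host;
    }
}
```

Defaulting the flag to false would leave existing deployments on the eager path.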



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #18 from jfclere <jf...@gmail.com> ---
Try https://github.com/jfclere/DNSPing-tomcat too



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #5 from Diego Rivera <di...@armedia.com> ---
To clarify: the problem is that MemberImpl expects the target member to exist
or be resolvable at object construction. If that resolution fails, then it
pukes out and the member is then discarded by the overarching infrastructure,
even if the member will become available shortly thereafter.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #17 from jfclere <jf...@gmail.com> ---
I need the pod.yaml and the service.yaml you are using, otherwise it is just
guessing.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #15 from Diego Rivera <di...@armedia.com> ---
To clarify re: the Service only returning its own IP.

In K8s, the pods behind the Service facade may change at any time, and without
warning. DNS caches are also a thing. So imagine this scenario: from your app
(Tomcat, for instance) you query the Service to get a list of the 10 pods that
are currently backing it. The pod's DNS cache will keep a copy of those same
IPs for a certain amount of time to avoid repeat lookups.

But lo ... a couple of seconds after you get that list, those 10 pods are gone
and 10 new pods with newer versions of the service/app in question are now up,
with 10 different IPs.

* Now you're unable to access any of the old IPs b/c they no longer exist
* You're unable to resolve to the new IPs due to DNS caching (which means you'd
have to either turn it off, or know when to nuke it)

So, essentially, you have a fun problem to fix.

Instead, if each service gets its own IP address (which is how it is), then
you ALWAYS go to that IP for the service, and it's up to the K8s subsystems to
finagle the firewalls/routing/whatnot to move the traffic to the right pods,
unbeknownst and invisibly to the service's clients.

This is why the DNSMembershipProvider is useless in K8s. It's coded based on a
flawed (perhaps even outdated) assumption.
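
A quick way to check what a name lookup actually returns from inside a pod, as
a sketch: ServiceLookup is a hypothetical helper, and the name to query is
whatever your Service is called (passed as the first argument). If the Service
behaves as described above, the list should hold a single address no matter how
many pods sit behind it. Note that the JVM keeps its own lookup cache as well
(controlled by the networkaddress.cache.ttl security property), so repeated
calls may not reflect a change right away.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting DNS answers; not part of Tomcat.
class ServiceLookup {
    // Returns every address the resolver reports for the given name.
    static List<String> addressesOf(String name) throws UnknownHostException {
        List<String> result = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(name)) {
            result.add(address.getHostAddress());
        }
        return result;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass the Service's DNS name as the first argument; defaults to localhost.
        String name = args.length > 0 ? args[0] : "localhost";
        System.out.println(name + " -> " + addressesOf(name));
    }
}
```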



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #13 from jfclere <jf...@gmail.com> ---
The DNS service and the logic in the Tomcat cluster expect the pods to be ready;
otherwise the whole thing is useless.
The lookup is done on the service, which returns all the available pods.
Your problems are probably related to the readiness probe you are using.



[Bug 66660] StaticMember doesn't support lazy hostname resolution (useful in K8s)

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66660

--- Comment #19 from Diego Rivera <di...@armedia.com> ---
(In reply to jfclere from comment #17)
> I need the pod.yaml and the service.yaml you are using, otherwise it is just
> guessing.

Trust me, they wouldn't make a difference.

Services are designed to be blackbox facades for pods. You know there are pods
behind the service, but you're not supposed to know which pods without actually
looking through the K8s API (which, again, has been covered).

Go ahead and deploy that in K8s (I don't know if K3s or K3d will do things
differently), and you'll see that the Service will get one IP, which is
different from the pod IP.

Specifically, if you add a 2nd or 3rd or 4th pod, you won't get their IPs when
you run a DNS query for the Service name.

Cheers.
