Posted to users@tomcat.apache.org by Scott Bradshaw <sc...@chillaxing.com> on 2009/04/15 21:50:49 UTC

JK 1.2.28 - load balancer worker fails on startup with one worker down ?

I apologize if this is a silly question, but I can't figure it out! I've looked
over the documentation and I'm stumped.

I have 5 load balanced workers defined. I have them set up and configured
correctly.

workers.properties file (partial - not including all the individual workers)
---------------------------------------------------------------
worker.mygpgby02.type=ajp13
worker.mygpgby02.host=mygpgby02.mycompany.com
worker.mygpgby02.port=8009

worker.loadbalancerprod.type=lb
worker.loadbalancerprod.balance_workers=mygpgby02,mygpgby03,mygpgby04,mygpgby05,mygpgby06
worker.list=loadbalancerprod
---------------------------------------------------------------
Now, here is the problem - one of those hosts (mygpgby06) is currently down
for maintenance. Whenever I start up IIS, the ISAPI proxy won't work. The
ISAPI log file shows this: (please note mycompany.com is not the actual URL
- I changed it)

[Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_ajp_common.c (2526):
worker mygpgby06 can't resolve tomcat address mygpgby06.mycompany.com
[Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_worker.c (163):
validate failed for mygpgby06
[Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_lb_worker.c (1599):
Failed creating worker mygpgby06
[Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_lb_worker.c (1647):
NULL parameters
[Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_worker.c (163):
validate failed for loadbalancerprod
[Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_worker.c (262): failed
to create worker loadbalancerprod
[Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_uri_worker_map.c
(506): Could not find worker with name 'loadbalancerprod' in uri map post
processing.

 If I take this worker out of the balance_workers list, everything starts up
fine. If I leave it in, my loadbalancerprod worker is completely dead..

 According to the documentation, "When starting up, the web server plugin
will instantiate the workers whose name appears in the worker.list
property..".

So - one worker in the load balancer won't start so the whole load balancer
is considered a failed worker.

Is there a property I'm missing to make this work ?

Scott

Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by fredk2 <fr...@gmail.com>.
Hi,

I understand that when it comes to security you do not want to start the
service, e.g. if the certificate is corrupted you do not want the SSL server
to start, full stop, or if Apache cannot bind to the hostname then it cannot
start, etc.
However, in this case there can be a few reasons why a Tomcat server's DNS
entry is removed, decommissioned or dropped. Imagine one Apache with
many unrelated workers (not just one lb)*.
So I understand that one does not want the service to fail during a restart
(nightly, automated or by an operator) while you investigate the problem ...
It seems to me more acceptable to have a potential performance degradation
than a loss of the whole service.

So +1 on changing the fatal to non-fatal, assuming the code change is one
line. I take scalability/stability over feature :-)

Rgds - Fred


awarnier wrote:
> 
> Rainer Jung wrote:
> [...]
>> What remains for me is your suggestion, that the error is not a fatal
>> one, since there are other balanced workers left. We could include such
>> a check in the startup code, although I'm not really convinced, that
>> your problem is a good reason for this.
>> 
>> I'm open to more argumentation and suggestions :)
>> 
> Argumentation #1 against a change in logic:
> The OP argues that one single unresolvable balanced worker should not 
> stop the other 4 from working, hence that the balancer should start 
> anyway, since 80% of the capacity is still available.  It sounds 
> reasonable in principle.
> But what if there are only 2 balanced workers in total, of which one is 
> unresolvable at start ? would it be normal to start with only one 
> balanced worker available anyway ?
> If not, then where's the limit of "acceptable" ?
> 
> Argumentation #2 against a change in logic:
> Suppose the balancer would start, with the resolved workers only.
> Suppose the resolving problem comes from a typo, not the fact that the 
> given host is temporarily out of the DNS system, but a definite 
> non-existing host.  It will not be retried, so there will never be 
> another error/warning message. The host itself may be ok and respond to 
> pings etc.., it will just never be hit by Apache's mod_jk, so this would 
> be a very quiet error.
> How is the sysadmin going to figure out that there is, basically, a 
> problem ?
> 
> Argumentation for a change in logging:
> It would be clearer if the error message stated explicitly that "the 
> balancer worker was not started due to a /configuration/ error, see 
> above message(s)".
> 
> But then, if even I could figure it out from the existing error message, 
> then just about everyone should be able to.
> And what would be the use of the likes of me, if everything was clear ?
> ;-)
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/JK-1.2.28---load-balancer-worker-fails-on-startup-with-one-worker--down---tp23065939p23099365.html
Sent from the Tomcat - User mailing list archive at Nabble.com.



Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Rainer Jung <ra...@kippdata.de>.
On 17.04.2009 21:44, David Fisher wrote:
> Rainer -
> 
> Wouldn't this type of dynamics occur if your workers were in a cloud? Or
> if you needed a lot more very quickly for peak processing?

Yes, but then you should extend your dynamics to your other
configurations as well. In fact there are tendencies now with
mod_cluster and mod_heartbeat to detect farm topology automatically. But
with mod_jk we are not there yet. So you need to take care of the
configuration with external mechanisms.

> Am I correct to think that if someone is being so "dynamic" with their
> worker's DNS configuration then they should automate using the status
> worker to handle configuration after initial startup of mod_jk?
> 
> http://tomcat.apache.org/connectors-doc/reference/status.html
> 
> Unfortunately, I don't see a way to add or remove a worker? I think that
> this type of dynamic configuration would be very helpful in managing a
> cloud of tomcat workers.

The status worker does not allow you to persist the changes. This will come
soon :)

> It would be pretty cool to be able to put Tomcat Workers into service
> merely by starting up a VM that uses DHCP to get its IP by name. The VM
> could then know to register itself with a mod_jk status worker. Wouldn't
> this make a Tomcat Cloud super easy to manage? It could even be
> geographically diverse and migratory.

As noted above there are brand new modules for dynamic farm topology
detection. I expect a lot of improvement in this area in the next 1-2 years.

Regards,

Rainer

> On Apr 17, 2009, at 10:16 AM, Rainer Jung wrote:
> 
>> On 17.04.2009 18:02, André Warnier wrote:
>>> To my knowledge, the only case where the DNS would fail to provide an IP
>>> address of a correctly-written FQDN name, is if you have some
>>> configuration where your hosts register themselves under some variable
>>> IP address when they startup.  But that would be a strange setup for
>>> servers, no ?
>>
>> Maybe some combination with DHCP? But I still don't like the concept of
>> this type of dynamics.
>>
>> Regards,
>>
>> Rainer


Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by David Fisher <df...@jmlafferty.com>.
Rainer -

Wouldn't this type of dynamics occur if your workers were in a cloud?  
Or if you needed a lot more very quickly for peak processing?

Am I correct to think that if someone is being so "dynamic" with their  
worker's DNS configuration then they should automate using the status  
worker to handle configuration after initial startup of mod_jk?

http://tomcat.apache.org/connectors-doc/reference/status.html

Unfortunately, I don't see a way to add or remove a worker? I think  
that this type of dynamic configuration would be very helpful in  
managing a cloud of tomcat workers.

It would be pretty cool to be able to put Tomcat Workers into service
merely by starting up a VM that uses DHCP to get its IP by name. The  
VM could then know to register itself with a mod_jk status worker.  
Wouldn't this make a Tomcat Cloud super easy to manage? It could even  
be geographically diverse and migratory.

Regards,
Dave

On Apr 17, 2009, at 10:16 AM, Rainer Jung wrote:

> On 17.04.2009 18:02, André Warnier wrote:
>> To my knowledge, the only case where the DNS would fail to provide  
>> an IP
>> address of a correctly-written FQDN name, is if you have some
>> configuration where your hosts register themselves under some  
>> variable
>> IP address when they startup.  But that would be a strange setup for
>> servers, no ?
>
> Maybe some combination with DHCP? But I still don't like the concept  
> of
> this type of dynamics.
>
> Regards,
>
> Rainer
>
>



Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Rainer Jung <ra...@kippdata.de>.
On 17.04.2009 18:02, André Warnier wrote:
> To my knowledge, the only case where the DNS would fail to provide an IP
> address of a correctly-written FQDN name, is if you have some
> configuration where your hosts register themselves under some variable
> IP address when they startup.  But that would be a strange setup for
> servers, no ?

Maybe some combination with DHCP? But I still don't like the concept of
this type of dynamics.

Regards,

Rainer


Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by André Warnier <aw...@ice-sa.com>.
David Fisher wrote:
> An interesting discussion. Since I am about to configure such a load 
> balancer and we prefer to use DNS, understanding this type of detail is 
> critical.
> 
> The OP said that the reason that the DNS did not resolve was that the 
> machine had been moved off the network. That may have been an event out 
> of the control of the sysadmin for the web service. Suppose someone 
> takes a server away and then a week later apache and mod_jk are 
> restarted via a cron job in the middle of the night? Suddenly the web 
> service is down.
> 
I think you misunderstand the issue.
If a server (hosting a Tomcat corresponding to a worker) is simply down, 
or not there, that will not stop mod_jk starting up.
What stops it starting up, is that /the DNS lookup for this machine's IP 
address does not work/.
As long as the DNS system is working, it will provide an IP address, and 
mod_jk will be happy and start up.
Later, when mod_jk tries to access that worker (always by IP), it
will notice that it is down, mark it as such, and check it from time
to time until it is up again.

Rainer's point was that if in your configuration you use DNS names
instead of IP addresses, then it means that you consider DNS a pretty
important part of your application/setup.
Thus if at startup the DNS system is not working for whatever reason, it
will prevent mod_jk from starting up properly.
But only then.

To my knowledge, the only case where the DNS would fail to provide an IP 
address of a correctly-written FQDN name, is if you have some 
configuration where your hosts register themselves under some variable 
IP address when they startup.  But that would be a strange setup for 
servers, no ?
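
For anyone who wants to catch this before a restart, a rough Python sketch of
a pre-flight check follows. It assumes the host entries look like
worker.<name>.host=<value>, as in the OP's workers.properties; the function
names are hypothetical and none of this is part of mod_jk.

```python
import re
import socket

# Matches lines such as "worker.mygpgby06.host=mygpgby06.mycompany.com".
HOST_LINE = re.compile(r"^\s*worker\.([^.\s=]+)\.host\s*=\s*(\S+)\s*$")


def worker_hosts(properties_text):
    """Return {worker_name: host} for every worker.<name>.host entry."""
    hosts = {}
    for line in properties_text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            hosts[match.group(1)] = match.group(2)
    return hosts


def unresolvable(hosts, resolve=socket.gethostbyname):
    """Return the names of workers whose host cannot be resolved.

    The resolver is injectable so the check can be tried out without DNS.
    """
    bad = []
    for worker, host in hosts.items():
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            bad.append(worker)
    return bad
```

A cron-driven restart could run something like this first and skip the
restart (or alert someone) if the returned list is not empty.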




Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by David Fisher <df...@jmlafferty.com>.
An interesting discussion. Since I am about to configure such a load  
balancer and we prefer to use DNS, understanding this type of detail  
is critical.

The OP said that the reason that the DNS did not resolve was that the  
machine had been moved off the network. That may have been an event  
out of the control of the sysadmin for the web service. Suppose  
someone takes a server away and then a week later apache and mod_jk  
are restarted via a cron job in the middle of the night? Suddenly the  
web service is down.

I think that one could argue that if a configuration has started
successfully before, then this error should be a warning, and if the
configuration file has been altered since that last successful start, it is
a fatal error. That may be too much extra logic, and it is non-deterministic,
as a configuration file change is hard to detect accurately.
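
To illustrate the idea (and its limits), here is a hedged Python sketch that
compares a digest of the current configuration text against one recorded at
the last successful start. The function names and the notion of a stored
"last good" digest are assumptions for illustration; nothing in this thread
suggests mod_jk does anything like this.

```python
import hashlib


def config_digest(text):
    """Return a SHA-256 hex digest of the configuration text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def resolution_failure_severity(current_text, last_good_digest):
    """Classify an unresolvable worker as 'warning' or 'fatal'.

    'warning' only if the configuration is byte-for-byte the one that
    started successfully last time; otherwise 'fatal', since the failure
    may come from a fresh edit such as a typo.
    """
    if last_good_digest is not None and config_digest(current_text) == last_good_digest:
        return "warning"
    return "fatal"
```

Note that a byte-level digest also changes on whitespace or comment edits,
which is one way such detection can be inaccurate.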

Perhaps more helpful would be to have a sysadmin email address in the  
config and then when things go fatal send an email with the  
appropriate log information. It is all about catching appropriately  
thrown error classes. What logging facility does mod_jk use? It could  
be that plugging in a special logger in this situation makes sense.

Regards,
Dave

On Apr 17, 2009, at 5:28 AM, André Warnier wrote:

> Rainer Jung wrote:
> [...]
>> What remains for me is your suggestion, that the error is not a fatal
>> one, since there are other balanced workers left. We could include  
>> such
>> a check in the startup code, although I'm not really convinced, that
>> your problem is a good reason for this.
>> I'm open to more argumentation and suggestions :)
> Argumentation #1 against a change in logic:
> The OP argues that one single unresolvable balanced worker should  
> not stop the other 4 from working, hence that the balancer should  
> start anyway, since 80% of the capacity is still available.  It  
> sounds reasonable in principle.
> But what if there are only 2 balanced workers in total, of which one  
> is unresolvable at start ? would it be normal to start with only one  
> balanced worker available anyway ?
> If not, then where's the limit of "acceptable" ?
>
> Argumentation #2 against a change in logic:
> Suppose the balancer would start, with the resolved workers only.
> Suppose the resolving problem comes from a typo, not the fact that  
> the given host is temporarily out of the DNS system, but a definite  
> non-existing host.  It will not be retried, so there will never be  
> another error/warning message. The host itself may be ok and respond  
> to pings etc.., it will just never be hit by Apache's mod_jk, so  
> this would be a very quiet error.
> How is the sysadmin going to figure out that there is, basically, a  
> problem ?
>
> Argumentation for a change in logging:
> It would be clearer if the error message stated explicitly that "the  
> balancer worker was not started due to a /configuration/ error, see  
> above message(s)".
>
> But then, if even I could figure it out from the existing error  
> message, then just about everyone should be able to.
> And what would be the use of the likes of me, if everything was  
> clear ?
> ;-)
>
>



Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by János Löbb <ja...@yale.edu>.
On Apr 17, 2009, at 8:28 AM, André Warnier wrote:

> Rainer Jung wrote:
> [...]
>> What remains for me is your suggestion, that the error is not a fatal
>> one, since there are other balanced workers left. We could include  
>> such
>> a check in the startup code, although I'm not really convinced, that
>> your problem is a good reason for this.
>> I'm open to more argumentation and suggestions :)
> Argumentation #1 against a change in logic:
> The OP argues that one single unresolvable balanced worker should  
> not stop the other 4 from working, hence that the balancer should  
> start anyway, since 80% of the capacity is still available.  It  
> sounds reasonable in principle.
> But what if there are only 2 balanced workers in total, of which one  
> is unresolvable at start ? would it be normal to start with only one  
> balanced worker available anyway ?
> If not, then where's the limit of "acceptable" ?
>
> Argumentation #2 against a change in logic:
> Suppose the balancer would start, with the resolved workers only.
> Suppose the resolving problem comes from a typo, not the fact that  
> the given host is temporarily out of the DNS system, but a definite  
> non-existing host.  It will not be retried, so there will never be  
> another error/warning message. The host itself may be ok and respond  
> to pings etc.., it will just never be hit by Apache's mod_jk, so  
> this would be a very quiet error.
> How is the sysadmin going to figure out that there is, basically, a  
> problem ?
>
> Argumentation for a change in logging:
> It would be clearer if the error message stated explicitly that "the  
> balancer worker was not started due to a /configuration/ error, see  
> above message(s)".
>
> But then, if even I could figure it out from the existing error  
> message, then just about everyone should be able to.
> And what would be the use of the likes of me, if everything was  
> clear ?
> ;-)
>

Perhaps with a variable:

JkQuorum

This could be set to a value between two and the number of workers under
the command of the load balancer. If it is not set, then the old behavior
can be followed, or it could default internally to 2 or to the number of
Tomcat instances balanced.
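
As a rough illustration of that rule, a Python sketch might look like this.
JkQuorum is only a proposed name here, not an existing mod_jk directive, and
balancer_may_start is a made-up function for illustration.

```python
def balancer_may_start(total_workers, resolved_workers, quorum=None):
    """Decide whether a load balancer should start.

    quorum=None models the old behaviour: every member must resolve.
    Otherwise quorum must lie between 2 and total_workers, and the
    balancer starts if at least that many members resolved.
    """
    if quorum is None:
        return resolved_workers == total_workers
    if not 2 <= quorum <= total_workers:
        raise ValueError("quorum must be between 2 and the number of workers")
    return resolved_workers >= quorum
```

With the OP's setup (5 members, 4 resolvable), the unset case refuses to
start, while a quorum of 3 would let the balancer come up.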

János

Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by André Warnier <aw...@ice-sa.com>.
Rainer Jung wrote:
[...]
> What remains for me is your suggestion, that the error is not a fatal
> one, since there are other balanced workers left. We could include such
> a check in the startup code, although I'm not really convinced, that
> your problem is a good reason for this.
> 
> I'm open to more argumentation and suggestions :)
> 
Argumentation #1 against a change in logic:
The OP argues that one single unresolvable balanced worker should not 
stop the other 4 from working, hence that the balancer should start 
anyway, since 80% of the capacity is still available.  It sounds 
reasonable in principle.
But what if there are only 2 balanced workers in total, of which one is 
unresolvable at start ? would it be normal to start with only one 
balanced worker available anyway ?
If not, then where's the limit of "acceptable" ?

Argumentation #2 against a change in logic:
Suppose the balancer would start, with the resolved workers only.
Suppose the resolving problem comes from a typo, not the fact that the 
given host is temporarily out of the DNS system, but a definite 
non-existing host.  It will not be retried, so there will never be 
another error/warning message. The host itself may be ok and respond to 
pings etc.., it will just never be hit by Apache's mod_jk, so this would 
be a very quiet error.
How is the sysadmin going to figure out that there is, basically, a 
problem ?

Argumentation for a change in logging:
It would be clearer if the error message stated explicitly that "the 
balancer worker was not started due to a /configuration/ error, see 
above message(s)".

But then, if even I could figure it out from the existing error message, 
then just about everyone should be able to.
And what would be the use of the likes of me, if everything was clear ?
;-)


Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Rainer Jung <ra...@kippdata.de>.
On 16.04.2009 17:55, Scott Bradshaw wrote:
> Still continuing to guess..
>> This is about efficiency.
>> If mod_jk had to do a DNS lookup each time it wants to send a packet to a
>> backend Tomcat (or at least each time it wants to create a new connection to
>> a backend Tomcat), that would be very inefficient.
>>
>> So, instead, mod_jk stores the IP address of each backend Tomcat, and
>> during normal operation uses only that stored IP.
>>
>> But, as a convenience, in the configuration file, it allows you to specify
>> the worker's host as a name if you wish; and if you do that, it does the
>> lookup once at startup, to resolve that to an IP which it stores.
>>
>> But if it cannot at startup do this lookup and resolve the name to an IP,
>> then it is stuck and cannot go further.
> 
> 
> Agreed - it cannot go further for this worker. I would expect an error to
> get logged and the load balanced worker to continue on to the next worker
> configured.
> 
> 
>> Because if it did go further, then it would have to store this worker's host
>> as an unresolved name, and then it would have to do a lookup during normal
>> operation.
>> QED.
>>
>> Now, I have really no idea if the code is really like that, but if not at
>> least it seems logical, doesn't it ?
>> ;-)
>>
>>
> Yes, it does seem logical, but flawed for a load balancer. If it was a
> normal worker, I would 100% agree it should quit.
> 
> I will go ahead and configure all the Tomcat nodes by IP address in the
> mod_jk config file instead of the host name. The risk of having our whole
> production web application go down if one host is not available is not worth
> this convenient "feature" of mod_jk.  :-)

André is very close to our original reasoning: As he explained, it is
only necessary, that the address is resolvable during startup. This is
not related to whether the node does work or not. The latter is detected
by the balancer dynamically.

Why do we force the address to be resolvable during startup?

This is because we think an address that formally has no chance of
working most likely represents a configuration error.

I could phrase it like this: if you are using names as addresses
(instead of IPs), then it is critical, that your DNS does work (and
contain the data). In your case the missing data in DNS is close to a
failed DNS and this will be a problem for almost all applications
configured with names.

Yes, there are too many error messages in this situation, but the first
one said:

worker mygpgby06 can't resolve tomcat address mygpgby06.mycompany.com

So it's not completely unlikely that you can spot there's a name resolution
problem.

What remains for me is your suggestion, that the error is not a fatal
one, since there are other balanced workers left. We could include such
a check in the startup code, although I'm not really convinced, that
your problem is a good reason for this.

I'm open to more argumentation and suggestions :)

Regards,

Rainer





Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Scott Bradshaw <sw...@gmail.com>.
> Still continuing to guess..
>
> This is about efficiency.
> If mod_jk had to do a DNS lookup each time it wants to send a packet to a
> backend Tomcat (or at least each time it wants to create a new connection to
> a backend Tomcat), that would be very inefficient.
>
> So, instead, mod_jk stores the IP address of each backend Tomcat, and
> during normal operation uses only that stored IP.
>
> But, as a convenience, in the configuration file, it allows you to specify
> the worker's host as a name if you wish; and if you do that, it does the
> lookup once at startup, to resolve that to an IP which it stores.
>
> But if it cannot at startup do this lookup and resolve the name to an IP,
> then it is stuck and cannot go further.


Agreed - it cannot go further for this worker. I would expect an error to
get logged and the load balanced worker to continue on to the next worker
configured.


> Because if it did go further, then it would have to store this worker's host
> as an unresolved name, and then it would have to do a lookup during normal
> operation.
> QED.
>
> Now, I have really no idea if the code is really like that, but if not at
> least it seems logical, doesn't it ?
> ;-)
>
>
Yes, it does seem logical, but flawed for a load balancer. If it was a
normal worker, I would 100% agree it should quit.

I will go ahead and configure all the Tomcat nodes by IP address in the
mod_jk config file instead of the host name. The risk of having our whole
production web application go down if one host is not available is not worth
this convenient "feature" of mod_jk.  :-)

Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by André Warnier <aw...@ice-sa.com>.
Scott Bradshaw wrote:
> Yep - you are right on.
> 
> The machine was taken off the network and moved to a test network for a few
> days. It currently does not resolve. If I change the host to its old IP
> address (which does not respond), the system starts up just fine.
> 
> I would expect to see an error in the log, but just because 1 host does not
> resolve, I wouldn't expect mod_jk to prevent the rest of the hosts from
> functioning. In the current configuration I just changed, the IP address I
> have now is not functioning and could be a configuration error, but mod_jk
> is still loading.
> 
> How do I go about submitting this as an enhancement request for the next
> version?
> 
Still continuing to guess..

This is about efficiency.
If mod_jk had to do a DNS lookup each time it wants to send a packet to 
a backend Tomcat (or at least each time it wants to create a new 
connection to a backend Tomcat), that would be very inefficient.

So, instead, mod_jk stores the IP address of each backend Tomcat, and 
during normal operation uses only that stored IP.

But, as a convenience, in the configuration file, it allows you to 
specify the worker's host as a name if you wish; and if you do that, it 
does the lookup once at startup, to resolve that to an IP which it stores.

But if it cannot at startup do this lookup and resolve the name to an 
IP, then it is stuck and cannot go further.
Because if it did go further, then it would have to store this
worker's host as an unresolved name, and then it would have to do a
lookup during normal operation.
QED.

Now, I have really no idea if the code is really like that, but if not 
at least it seems logical, doesn't it ?
;-)
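
To make the guess above concrete, here is a minimal Python sketch of the
resolve-once-at-startup idea. It is not mod_jk's actual code (which is C);
the function name resolve_workers and the injectable resolve parameter are
made up for illustration.

```python
import socket


def resolve_workers(worker_hosts, resolve=socket.gethostbyname):
    """Resolve each worker's host once and return {worker: ip}.

    Raises RuntimeError on the first host that cannot be resolved,
    mirroring the all-or-nothing startup behaviour guessed at above.
    """
    resolved = {}
    for worker, host in worker_hosts.items():
        try:
            # Done once at startup; later connections reuse the stored IP.
            resolved[worker] = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            raise RuntimeError(
                f"worker {worker} can't resolve tomcat address {host}"
            ) from exc
    return resolved
```

With this shape, a host given as an IPv4 address passes straight through
(socket.gethostbyname returns an IPv4 literal unchanged), which would fit the
observation that switching to the IP address lets the balancer start.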



Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Scott Bradshaw <sw...@gmail.com>.
Yep - you are right on.

The machine was taken off the network and moved to a test network for a few
days. It currently does not resolve. If I change the host to its old IP
address (which does not respond), the system starts up just fine.

I would expect to see an error in the log, but just because 1 host does not
resolve, I wouldn't expect mod_jk to prevent the rest of the hosts from
functioning. In the current configuration I just changed, the IP address I
have now is not functioning and could be a configuration error, but mod_jk
is still loading.

How do I go about submitting this as an enhancement request for the next
version?

Thanks for your help!

Scott


On Wed, Apr 15, 2009 at 6:20 PM, André Warnier <aw...@ice-sa.com> wrote:

> If we just stick to the actual error message for a moment, and assume it
> means what it says :
> >> worker mygpgby06 can't resolve tomcat address mygpgby06.mycompany.com
> the first question would be : why can the DNS name "
> mygpgby06.mycompany.com" not be /resolved/ to an IP address when host
> "mygpgby06" (?) is down for maintenance ?
>
> Is there some kind of dynamic DNS system at work there ?
>
> What happens if you replace
> >> worker.mygpgby06.host=mygpgby06.mycompany.com
> by
> >> worker.mygpgby06.host=xxx.xxx.xxx.xxx
> (xxx.xxx.xxx.xxx being the actual IP address of that host)
>
> I'm just guessing here, but what if it is so that, at least at start, the
> load balancing members must at least be able to be resolved to an IP
> address, otherwise mod_jk determines that there's really something wrong
> with the configuration, and won't even start ?
>
>
>
>
>
> Scott Bradshaw wrote:
>
>> /portal/*=loadbalancerprod
>>
>> The uriworkermap.properties file is correct - workers are correctly sent
>> to
>> it assuming all the workers are accessible.
>>
>> The problem is when the workers in the load balancer are being
>> initialized,
>> if one worker is not available, the load balance worker is considered not
>> valid. Because its not valid, requests will not be sent to it. This does
>> not
>> seem to be the desired behavior of a load balancer.
>>
>> Scott
>>
>> On Wed, Apr 15, 2009 at 5:32 PM, Jorge Medina <jm...@e-dialog.com>
>> wrote:
>>
>>> Your workers.properties looks fine.
>>>
>>> What is the content of uriworkermap.properties ?
>>>
>>> -----Original Message-----
>>> From: swbradshaw@gmail.com [mailto:swbradshaw@gmail.com] On Behalf Of
>>> Scott Bradshaw
>>> Sent: Wednesday, April 15, 2009 3:51 PM
>>> To: users@tomcat.apache.org
>>> Subject: JK 1.2.28 - load balancer worker fails on startup with one
>>> worker down ?
>>>
>>> I apologize if this is a silly question, but I can't figure it out! I've
>>> looked over the documentation and I'm stumped.
>>>
>>> I have 5 load balanced workers defined. I have them set up and configured
>>> correctly.
>>>
>>> workers.properties file (partial - not including all the individual
>>> workers)
>>> ---------------------------------------------------------------
>>> worker.mygpgby02.type=ajp13
>>> worker.mygpgby02.host=mygpgby02.mycompany.com
>>> worker.mygpgby02.port=8009
>>>
>>> worker.loadbalancerprod.type=lb
>>> worker.loadbalancerprod.balance_workers=mygpgby02,mygpgby03,mygpgby04,mygpgby05,mygpgby06
>>> worker.list=loadbalancerprod
>>> ---------------------------------------------------------------
>>> Now, here is the problem - one of those hosts (mygpgby06) is currently
>>> down for maintenance. Whenever I start up IIS, the ISAPI proxy won't
>>> work. The ISAPI log file shows this: (please note mycompany.com is not
>>> the actual URL
>>> - I changed it)
>>>
>>> [Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_ajp_common.c
>>> (2526):
>>> worker mygpgby06 can't resolve tomcat address mygpgby06.mycompany.com
>>> [Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_worker.c (163):
>>> validate failed for mygpgby06
>>> [Wed Apr 15 14:22:00.463 2009] [4208:2848] [error] jk_lb_worker.c
>>> (1599):
>>> Failed creating worker mygpgby06
>>> [Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_lb_worker.c
>>> (1647):
>>> NULL parameters
>>> [Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_worker.c (163):
>>> validate failed for loadbalancerprod
>>> [Wed Apr 15 14:22:00.479 2009] [4208:2848] [error] jk_worker.c (262):
>>> failed to create worker loadbalancerprod [Wed Apr 15 14:22:00.479 2009]
>>> [4208:2848] [error] jk_uri_worker_map.c
>>> (506): Could not find worker with name 'loadbalancerprod' in uri map
>>> post processing.
>>>
>>>  If I take this worker out of the balance_workers list, everything
>>> starts up fine. If I leave it in, my loadbalancerprod worker is
>>> completely dead..
>>>
>>>  According to the documentation, "When starting up, the web server
>>> plugin with instantiate the workers whose name appears in the
>>> worker.list property..".
>>>
>>> So - one worker in the load balancer won't start so the whole load
>>> balancer is considered a failed worker.
>>>
>>> Is there a property I'm missing to make this work ?
>>>
>>> Scott
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>> For additional commands, e-mail: users-help@tomcat.apache.org
>>>
>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by André Warnier <aw...@ice-sa.com>.
If we just stick to the actual error message for a moment, and assume it
means what it says:
 >> worker mygpgby06 can't resolve tomcat address mygpgby06.mycompany.com
the first question would be: why can the DNS name
"mygpgby06.mycompany.com" not be /resolved/ to an IP address when host
"mygpgby06" (?) is down for maintenance?

Is there some kind of dynamic DNS system at work there?

What happens if you replace
 >> worker.mygpgby06.host=mygpgby06.mycompany.com
with
 >> worker.mygpgby06.host=xxx.xxx.xxx.xxx
(xxx.xxx.xxx.xxx being the actual IP address of that host)?

I'm just guessing here, but it may be that, at least at startup, every
member of the load balancer must be resolvable to an IP address;
otherwise mod_jk concludes that something is really wrong with the
configuration and won't even start.
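
One way to test this guess before restarting IIS is to check, outside
the connector, whether every configured worker host currently resolves.
The following is only a rough sketch in Python, not part of mod_jk or
the ISAPI redirector; the function names are made up for illustration,
and it assumes each worker uses a plain "worker.<name>.host=<host>"
line as in the configuration quoted in this thread.

```python
import re
import socket

# Matches lines such as "worker.mygpgby02.host=mygpgby02.mycompany.com".
HOST_LINE = re.compile(r"^\s*worker\.([^.\s=]+)\.host\s*=\s*(\S+)\s*$")


def worker_hosts(props_text):
    """Return {worker_name: host} for every worker.<name>.host line."""
    hosts = {}
    for line in props_text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            hosts[match.group(1)] = match.group(2)
    return hosts


def unresolvable_workers(props_text, resolve=socket.gethostbyname):
    """Return the sorted names of workers whose host cannot be resolved.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    failed = []
    for name, host in worker_hosts(props_text).items():
        try:
            resolve(host)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            failed.append(name)
    return sorted(failed)
```

Calling unresolvable_workers() on the text of workers.properties would
list the workers whose host names fail to resolve from the machine it
runs on. An empty list does not prove the connector will start; it only
rules out the name resolution failure shown in the log.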






Re: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Scott Bradshaw <sw...@gmail.com>.
/portal/*=loadbalancerprod

The uriworkermap.properties file is correct - requests are correctly sent to
the load balancer as long as all the workers are accessible.

The problem is that when the workers in the load balancer are being
initialized, if one worker is not available, the load balancer worker is
considered not valid. Because it's not valid, requests will not be sent to
it. This does not seem to be the desired behavior of a load balancer.
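
Until that is resolved, the only workaround observed in this thread is
taking the unavailable member out of the balance_workers list. As a
rough sketch (Python, with a made-up helper name; it is not a feature of
the connector), that edit could be scripted so the file is rewritten
before IIS is started:

```python
def drop_balance_members(props_text, lb_name, excluded):
    """Return props_text with the excluded names removed from
    worker.<lb_name>.balance_workers; all other lines are kept as-is."""
    key = "worker.%s.balance_workers" % lb_name
    out = []
    for line in props_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Split the comma-separated member list and drop excluded names.
            members = [m.strip() for m in value.split(",") if m.strip()]
            kept = [m for m in members if m not in excluded]
            line = "%s=%s" % (key, ",".join(kept))
        out.append(line)
    return "\n".join(out)
```

For the configuration above, drop_balance_members(text,
"loadbalancerprod", {"mygpgby06"}) would leave the other four members
in place. This only automates the manual edit; it does not change how
the load balancer validates its members.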

Scott


RE: JK 1.2.28 - load balancer worker fails on startup with one worker down ?

Posted by Jorge Medina <jm...@e-dialog.com>.
Your workers.properties looks fine.

What is the content of uriworkermap.properties?


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org