Posted to dev@lucene.apache.org by "Ramkumar R. Aiyengar" <an...@gmail.com> on 2015/02/26 19:52:11 UTC

reuseAddress default in Solr jetty.xml

The jetty.xml we currently ship by default doesn't set reuseAddress=true.
If you are having a bad GC day with things going OOM and resulting in Solr
not even being able to shut down cleanly (or the oom_solr.sh script killing
it), whatever external service management mechanism you have is probably
going to try to respawn it and fail with the default config because the
ports will be in TIME_WAIT. I guess there's the usual disclaimer about
reuseAddress causing stray packets to reach the restarted server, but it
sounds like the default, at least, should be true..

I can raise a JIRA, but just wanted to check if anyone has any opinions
either way..

Re: reuseAddress default in Solr jetty.xml

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/2/2015 10:34 AM, Reitzel, Charles wrote:
>
> Hi Ram,
>
> It appears the problem is that the old solr/jetty process is actually
> still running when the new solr/jetty process is started.  That’s the
> problem that needs fixing.
>
> This is not a rare problem in systems with worker threads dedicated to
> different tasks.  These threads need to wake up in response to the
> shutdown signal/command, as well as the normal inputs.
>
> It’s a bug I’ve created and fixed a couple of times over the years …
> :-)  I wouldn’t know where to start with Solr.  But, as I say,
> re-using the port is a band-aid.  I’ve yet to see a case where it is
> the best solution.
>

I can't say whether the lack of the reuse option on the stop port
binding is a real problem.  I can say that I've never had a problem with
my init script for the 4.x example jetty, which *DOES* use the STOPPORT
and STOPKEY options.  I know that there have been times when Solr has
been completely unresponsive and the init script has been forced to use
the -9 signal.  You can find this init script (in redhat and ubuntu
varieties) here:

http://wiki.apache.org/solr/ShawnHeisey

Thanks,
Shawn




Re: reuseAddress default in Solr jetty.xml

Posted by Mark Miller <ma...@gmail.com>.
But all too often necessary :)

On Tue, Mar 3, 2015 at 12:14 AM Ramkumar R. Aiyengar <
andyetitmoves@gmail.com> wrote:

> I agree, sigkill is typically the last resort..

RE: reuseAddress default in Solr jetty.xml

Posted by "Ramkumar R. Aiyengar" <an...@gmail.com>.
I agree, sigkill is typically the last resort..
On 3 Mar 2015 00:49, "Reitzel, Charles" <Ch...@tiaa-cref.org>
wrote:

>  My bad.  Too long away from sockets since cleaning up those shutdown
> handlers.  Your point is well taken, on the server side the risks of
> consuming a stray echo packet are fairly low (but non-zero, if you’ve ever
> spent any quality time with tcpdump/wireshark).
>
>
>
> Still, in a production setting, SIGKILL (aka “kill -9”) should be a last
> resort after more reasonable methods (e.g. SIGINT, SIGTERM, SIGSTOP) have
> failed.

RE: reuseAddress default in Solr jetty.xml

Posted by "Reitzel, Charles" <Ch...@tiaa-cref.org>.
My bad.  Too long away from sockets since cleaning up those shutdown handlers.  Your point is well taken: on the server side, the risk of consuming a stray echo packet is fairly low (but non-zero, if you’ve ever spent any quality time with tcpdump/wireshark).

Still, in a production setting, SIGKILL (aka “kill -9”) should be a last resort after more reasonable methods (e.g. SIGINT, SIGTERM, SIGSTOP) have failed.
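
As a rough illustration of the "polite request first, SIGKILL last" ordering, here is a minimal Java sketch of how a service wrapper might stop a child process. The class name StopThenKill, the stop() helper and the grace period are made up for this example and are not part of Solr or Jetty. It assumes a Unix-like system with a "sleep" binary, and a typical Unix JVM where Process.destroy() sends SIGTERM and destroyForcibly() sends SIGKILL (the Javadoc leaves the exact behaviour implementation dependent). SIGSTOP is left out because it only suspends a process and cannot be handled, so it does not trigger a shutdown.

```java
import java.util.concurrent.TimeUnit;

public class StopThenKill {

    /**
     * Asks the process to exit, waits up to graceMillis for it to do so,
     * and only then kills it forcibly.
     * Returns true if the polite request was enough.
     */
    static boolean stop(Process process, long graceMillis) throws InterruptedException {
        process.destroy();                 // polite request (assumed: SIGTERM on Unix JVMs)
        if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
            return true;                   // exited within the grace period
        }
        process.destroyForcibly();         // last resort (assumed: SIGKILL on Unix JVMs)
        process.waitFor();                 // wait for the process to be gone
        return false;
    }

    public static void main(String[] args) throws Exception {
        // "sleep 60" is only a stand-in for a real server process.
        Process child = new ProcessBuilder("sleep", "60").start();
        System.out.println("graceful=" + stop(child, 5000));
    }
}
```

A process that ignores the polite request costs the caller the full grace period before the forced kill, so the grace period is a trade-off between restart latency and giving shutdown handlers time to run.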

*************************************************************************
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete it.

TIAA-CREF
*************************************************************************

RE: reuseAddress default in Solr jetty.xml

Posted by "Ramkumar R. Aiyengar" <an...@gmail.com>.
No, reuseAddress doesn't allow you to have two processes, old and new,
listen to the same port. There's no option which allows you to do that.

TL;DR: This can happen when you have a connection to a server which gets
killed hard and comes back up immediately.

So here's what happens.

When a server shuts down normally, it triggers an active close on all open
TCP connections it has. That starts a three-way message exchange with the
remote recipient (FIN, FIN+ACK, ACK), at the end of which the socket is
closed and the kernel keeps it in a TIME_WAIT state for a few minutes in
the background (this depends on the OS; the maximum tends to be 4 minutes).
This is needed to allow for reordered older packets to reach the machine,
just in case. Now, typically, if the server restarts within that period and
tries to bind again to the same port, the kernel is smart enough not to
complain that there is an existing socket in TIME_WAIT, because it knows
the last sequence number it used for the final message in the previous
process, and since sequence numbers are always increasing, it can reject
any messages before that sequence number now that a new process has taken
the port.

The trouble is with abnormal shutdown. There's no time for a proper
goodbye, so the kernel marks the socket to respond to remote packets with a
rude RST (reset). Since there has been no goodbye with the remote end, it
also doesn't know the last sequence number to use as a cutoff if a new
process binds to the same port. Hence, by default, it denies binding to
that port for the TIME_WAIT period, to avoid the off chance that a stray
packet gets picked up by the new process and utterly confuses it. By
setting reuseAddress, you are essentially waiving this protection. Note
that the possibility of confusion is unbelievably minuscule in the first
place (both the source and destination host:port have to be the same, and
the client port is generally randomly allocated). If the port we are
talking about is a local port, it's almost impossible -- you have bigger
problems if a TCP packet is lost or delayed within the same machine!

As to Shawn's point, for Solr's stop port, you essentially need to be
actively trying to shut down the server using the stop port, or be within a
few minutes of such an attempt, when the server is killed. Just the server
being killed without any active connection to it is not going to cause this
issue.
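
For reference, this is roughly what the reuseAddress setting amounts to at the socket level: a minimal sketch using plain java.net rather than Jetty's own connector classes. The class name ReuseAddressSketch and the openListener() helper are invented for illustration. Per the ServerSocket Javadoc, SO_REUSEADDR has to be enabled before bind() for it to allow binding while a previous connection on that address is still in the timeout state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddressSketch {

    /** Opens a loopback listener with SO_REUSEADDR requested before bind(). */
    static ServerSocket openListener(int port) {
        ServerSocket socket = null;
        try {
            socket = new ServerSocket();        // created unbound on purpose
            socket.setReuseAddress(true);       // must happen before bind()
            socket.bind(new InetSocketAddress("127.0.0.1", port));
            return socket;
        } catch (IOException e) {
            if (socket != null) {
                try {
                    socket.close();             // don't leak the descriptor on failure
                } catch (IOException ignored) {
                    // nothing more to do here
                }
            }
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 lets the OS pick a free port, so the sketch runs anywhere.
        try (ServerSocket socket = openListener(0)) {
            System.out.println("listening on port " + socket.getLocalPort());
        }
    }
}
```

Note that this does not let two live processes listen on the same port at once; it only affects whether bind() is refused because of leftover connections in the timeout state.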


RE: reuseAddress default in Solr jetty.xml

Posted by "Reitzel, Charles" <Ch...@tiaa-cref.org>.
Hi Ram,

It appears the problem is that the old solr/jetty process is actually still running when the new solr/jetty process is started.  That’s the problem that needs fixing.

This is not a rare problem in systems with worker threads dedicated to different tasks.  These threads need to wake up in response to the shutdown signal/command, as well as the normal inputs.

It’s a bug I’ve created and fixed a couple of times over the years … :-)  I wouldn’t know where to start with Solr.  But, as I say, re-using the port is a band-aid.  I’ve yet to see a case where it is the best solution.
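
The point about worker threads waking up for shutdown as well as for normal input can be sketched in Java as follows. This is a generic illustration, not Solr or Jetty code: the class InterruptibleWorker and its methods are hypothetical names. The worker waits on a queue with a bounded timeout, and shutdown() both clears a flag and interrupts the thread so a blocked wait returns immediately.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class InterruptibleWorker implements Runnable {
    private final BlockingQueue<String> tasks = new LinkedBlockingQueue<>();
    private final Thread thread = new Thread(this, "worker");
    private volatile boolean running = true;

    void start() {
        thread.start();
    }

    void submit(String task) {
        tasks.add(task);
    }

    /** Signals shutdown, waits up to timeoutMillis, and reports whether the worker exited. */
    boolean shutdown(long timeoutMillis) throws InterruptedException {
        running = false;
        thread.interrupt();          // wakes the worker if it is blocked in poll()
        thread.join(timeoutMillis);
        return !thread.isAlive();
    }

    @Override
    public void run() {
        while (running) {
            try {
                // Normal input, but with a bounded wait so the flag is re-checked.
                String task = tasks.poll(1, TimeUnit.SECONDS);
                if (task != null) {
                    System.out.println("handled " + task);
                }
            } catch (InterruptedException e) {
                // Interrupted by shutdown(): loop around and re-check the flag.
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        InterruptibleWorker worker = new InterruptibleWorker();
        worker.start();
        worker.submit("job-1");
        System.out.println("stopped=" + worker.shutdown(2000));
    }
}
```

A worker that blocks indefinitely on input and never checks for shutdown is what keeps the old process (and its ports) alive after the stop command has been issued.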

best,
Charlie

From: Ramkumar R. Aiyengar [mailto:andyetitmoves@gmail.com]
Sent: Saturday, February 28, 2015 8:15 PM
To: dev@lucene.apache.org
Subject: Re: reuseAddress default in Solr jetty.xml

Hey Charles, see my explanation above on why this is needed. If Solr has to be killed, it would generally be immediately restarted. This would normally not be the case, except when things are potentially misconfigured or if there is a bug, but not doing so makes the impact worse..
In any case, it turns out that reuseAddress is true by default for the connectors we use, so that really isn't the issue. The issue, more specifically, is that the stop port doesn't set it, so the actual port by itself starts just fine on a restart, but the stop port fails to bind -- and there's currently no way in Jetty to configure that.
Based on my question on the Jetty mailing list, I have now created an issue for them..

https://bugs.eclipse.org/bugs/show_bug.cgi?id=461133

On Fri, Feb 27, 2015 at 3:03 PM, Reitzel, Charles <Ch...@tiaa-cref.org> wrote:
Disclaimer: I’m not a Solr committer.  But, as a developer, I’ve never seen a good case for reusing the listening port.  Better to find and fix the root cause of the zombie state (or just slow shutdown, sometimes) and release the port.

From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Thursday, February 26, 2015 5:28 PM
To: dev@lucene.apache.org
Subject: Re: reuseAddress default in Solr jetty.xml

+1

- Mark

On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <an...@gmail.com> wrote:
The jetty.xml we currently ship by default doesn't set reuseAddress=true. If you are having a bad GC day with things going OOM and resulting in Solr not even being able to shut down cleanly (or the oom_solr.sh script killing it), whatever external service management mechanism you have is probably going to try to respawn it and fail with the default config because the ports will be in TIME_WAIT. I guess there's the usual disclaimer with reuseAddress causing stray packets to reach the restarted server, but it sounds like at least the default should be true.

I can raise a JIRA, but just wanted to check if anyone has any opinions either way.




--
Not sent from my iPhone or my Blackberry or anyone else's


Re: reuseAddress default in Solr jetty.xml

Posted by "Ramkumar R. Aiyengar" <an...@gmail.com>.
Hey Charles, see my explanation above on why this is needed. If Solr has to
be killed, it would generally be restarted immediately. This would not
normally be the case, except when things are potentially misconfigured or
there is a bug, but not reusing the address makes the impact worse.

In any case, it turns out that reuseAddress is true by default for the
connectors we use, so that isn't really the issue. More specifically, the
issue is that the stop port doesn't set it, so the actual port by itself
starts just fine on a restart, but the stop port fails to bind, and there
is currently no way in Jetty to configure that.
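
For readers unfamiliar with the option, here is a minimal sketch of what setting reuseAddress means at the socket level. It uses only java.net and is not Jetty's stop-port code; the class name ReuseAddressSketch and the helper openListener are hypothetical. The one detail it is meant to show is that SO_REUSEADDR has to be set on an unbound socket, before bind.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; not how Jetty opens its stop port.
class ReuseAddressSketch {

    /** Opens a loopback listener, setting SO_REUSEADDR before binding. */
    static ServerSocket openListener(int port, boolean reuse) throws IOException {
        ServerSocket socket = new ServerSocket();      // created unbound
        socket.setReuseAddress(reuse);                 // must precede bind()
        socket.bind(new InetSocketAddress("127.0.0.1", port));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port, so the sketch runs anywhere.
        try (ServerSocket s = openListener(0, true)) {
            System.out.println("bound to port " + s.getLocalPort()
                    + ", reuseAddress=" + s.getReuseAddress());
        }
    }
}
```

With the option enabled, a restarted process can typically bind a port that still has old connections in TIME_WAIT; without it, the bind may fail until those connections expire.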

Based on my question on the Jetty mailing list, I have now created an issue
for them.

https://bugs.eclipse.org/bugs/show_bug.cgi?id=461133


On Fri, Feb 27, 2015 at 3:03 PM, Reitzel, Charles <
Charles.Reitzel@tiaa-cref.org> wrote:

>  Disclaimer: I’m not a Solr committer.  But, as a developer, I’ve never
> seen a good case for reusing the listening port.   Better to find and fix
> the root cause of the zombie state (or just slow shutdown, sometimes) and
> release the port.
>
>
>
> *From:* Mark Miller [mailto:markrmiller@gmail.com]
> *Sent:* Thursday, February 26, 2015 5:28 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: reuseAddress default in Solr jetty.xml
>
>
>
> +1
>
> - Mark
>
>
>
> On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <
> andyetitmoves@gmail.com> wrote:
>
> The jetty.xml we currently ship by default doesn't set reuseAddress=true.
> If you are having a bad GC day with things going OOM and resulting in Solr
> not even being able to shut down cleanly (or the oom_solr.sh script killing
> it), whatever external service management mechanism you have is probably
> going to try to respawn it and fail with the default config because the
> ports will be in TIME_WAIT. I guess there's the usual disclaimer with
> reuseAddress causing stray packets to reach the restarted server, but it
> sounds like at least the default should be true.
>
> I can raise a JIRA, but just wanted to check if anyone has any opinions
> either way.
>
>
>
>
>



-- 
Not sent from my iPhone or my Blackberry or anyone else's

RE: reuseAddress default in Solr jetty.xml

Posted by "Reitzel, Charles" <Ch...@tiaa-cref.org>.
Disclaimer: I’m not a Solr committer.  But, as a developer, I’ve never seen a good case for reusing the listening port.   Better to find and fix the root cause of the zombie state (or just slow shutdown, sometimes) and release the port.

From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Thursday, February 26, 2015 5:28 PM
To: dev@lucene.apache.org
Subject: Re: reuseAddress default in Solr jetty.xml

+1

- Mark

On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <an...@gmail.com> wrote:
The jetty.xml we currently ship by default doesn't set reuseAddress=true. If you are having a bad GC day with things going OOM and resulting in Solr not even being able to shut down cleanly (or the oom_solr.sh script killing it), whatever external service management mechanism you have is probably going to try to respawn it and fail with the default config because the ports will be in TIME_WAIT. I guess there's the usual disclaimer with reuseAddress causing stray packets to reach the restarted server, but it sounds like at least the default should be true.

I can raise a JIRA, but just wanted to check if anyone has any opinions either way.



Re: reuseAddress default in Solr jetty.xml

Posted by Mark Miller <ma...@gmail.com>.
+1

- Mark

On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <
andyetitmoves@gmail.com> wrote:

> The jetty.xml we currently ship by default doesn't set reuseAddress=true.
> If you are having a bad GC day with things going OOM and resulting in Solr
> not even being able to shut down cleanly (or the oom_solr.sh script killing
> it), whatever external service management mechanism you have is probably
> going to try to respawn it and fail with the default config because the
> ports will be in TIME_WAIT. I guess there's the usual disclaimer with
> reuseAddress causing stray packets to reach the restarted server, but it
> sounds like at least the default should be true.
>
> I can raise a JIRA, but just wanted to check if anyone has any opinions
> either way.
>
>