Posted to users@activemq.apache.org by "James A. Robinson" <ji...@gmail.com> on 2016/03/18 14:04:00 UTC

NFS v4 locks "given up" w/o any logging?

Is it common that an activemq broker might give up its NFS v4 lock w/o
logging any sort of message?  I've got two brokers that logged this:

broker-a which held the lock:
2016-03-17 15:01:51,113 [yMonitor Worker] WARN  Transport
   - Transport Connection to: tcp://104.232.16.4:62269 failed:
org.apache.activemq.transport.InactivityIOException: Channel was inactive
for too (>30000) long: tcp://xxx.xxx.xxx.xxx:62269
2016-03-18 00:05:22,751 [KeepAlive Timer] INFO  LockFile
    - Lock file /var/log/activemq/activemq-data/amq-dev-1/lock, locked at
Thu Mar 17 13:38:33 PDT 2016, has been modified at Fri Mar 18 00:02:15 PDT
2016
2016-03-18 00:05:22,758 [KeepAlive Timer] ERROR LockableServiceSupport
    - amq-dev-1, no longer able to keep the exclusive lock so giving up
being a master
2016-03-18 00:05:22,761 [KeepAlive Timer] INFO  BrokerService
   - Apache ActiveMQ 5.13.2 (amq-dev-1, ID:cluster-51079-1458247119790-1:1)
is shutting down

broker-b which appeared to steal the lock:
2016-03-17 13:38:52,680 [JMX connector  ] INFO  ManagementContext
   - JMX consoles can connect to
service:jmx:rmi://localhost:2020/jndi/rmi://localhost:2020/jmxrmi
2016-03-18 00:02:23,593 [erSimpleAppMain] INFO  MessageDatabase
   - KahaDB is version 6
2016-03-18 00:02:23,762 [erSimpleAppMain] INFO  MessageDatabase
   - Recovering from the journal @1:63912
2016-03-18 00:02:24,043 [erSimpleAppMain] INFO  MessageDatabase
   - Recovery replayed 7130 operations from the journal in 0.296 seconds.
2016-03-18 00:02:24,058 [erSimpleAppMain] INFO  PListStoreImpl
    - PListStore:[/var/log/activemq/activemq-data/amq-dev-1/tmp_storage]
started
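
For what it's worth, the two timestamps broker-a reports come straight from
the lock file on the shared mount, so a quick sanity check while this is
happening is to look at that file directly (a sketch, using the path from the
log lines above; stat simply shows the modification time the keep-alive check
is comparing against):

# Show the lock file's current modification time on the NFS share.
stat /var/log/activemq/activemq-data/amq-dev-1/lock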

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@highwire.org>.
To close the loop on this for anyone that digs it up later...

While the issue of lock handoff is still an open one, and one that I think
may require a new Locker scheme, the underlying cause of the problem I was
seeing with my activemq cluster was a misunderstanding about how the network
was supposed to be configured on the NFS server side.

I had isolated the symptom down to the fact that writes of over 16 MiB
caused the NFS layer to lock up.  Once the NFS server admins corrected the
network routes on the server side the lockups stopped, and I'm confident this
has resolved the immediate issues I was seeing.
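
For anyone trying to confirm a similar symptom, a large sequential write to
the mount point exercises the NFS layer without involving the broker at all.
The mount point, file name, and size below are only illustrative (anything
comfortably over the 16 MiB threshold I saw would do):

# Write 32 MiB to the NFS mount and see whether it stalls.  oflag=direct
# bypasses the page cache so the data is pushed to the server right away.
dd if=/dev/zero of=/mnt/activemq-nfs/write-test bs=1M count=32 oflag=direct
rm /mnt/activemq-nfs/write-test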

Jim


On Wed, Mar 23, 2016 at 7:21 AM James A. Robinson <ji...@highwire.org> wrote:

> Thank you for the information.  Would you be able to tell me if your
> NetApp settings for the lock lease are the standard values?  I'm told that
> it's 30 seconds, and I was wondering whether I need to have it raised much
> higher to prevent the particular issue I'm seeing (the TCP layer trying to
> send SYN packets for 3 minutes before giving up).
>
> On Tue, Mar 22, 2016 at 5:27 PM Paul Gale <pa...@gmail.com> wrote:
>
>> We're using the following with our NetApp device:
>>
>> -rw,soft,intr,noexec,nosuid,nodev,users,rsize=65535,wsize=65535,proto=tcp
>>
>> Whilst 'soft' is generally not recommended, I've found that using 'hard'
>> causes the broker to lock up immediately.
>> The settings we have in place above were chosen by our storage team.
>>
>> Thanks,
>> Paul
>>
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@highwire.org>.
Thank you for the information.  Would you be able to tell me if your NetApp
settings for the lock lease are the standard values?  I'm told that it's 30
seconds, and I was wondering whether I need to have it raised much higher to
prevent the particular issue I'm seeing (the TCP layer trying to send SYN
packets for 3 minutes before giving up).

On Tue, Mar 22, 2016 at 5:27 PM Paul Gale <pa...@gmail.com> wrote:

> We're using the following with our NetApp device:
>
> -rw,soft,intr,noexec,nosuid,nodev,users,rsize=65535,wsize=65535,proto=tcp
>
> Whilst 'soft' is generally not recommended, I've found that using 'hard'
> causes the broker to lock up immediately.
> The settings we have in place above were chosen by our storage team.
>
> Thanks,
> Paul
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by Paul Gale <pa...@gmail.com>.
We're using the following with our NetApp device:

-rw,soft,intr,noexec,nosuid,nodev,users,rsize=65535,wsize=65535,proto=tcp

Whilst 'soft' is generally not recommended, I've found that using 'hard'
causes the broker to lock up immediately.
The settings we have in place above were chosen by our storage team.
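
For reference, with a soft mount the failure window is governed by timeo and
retrans: once the retries are exhausted the operation is failed back to the
caller instead of hanging forever.  A sketch of what an equivalent mount with
an explicit timeout budget might look like (the filer name, export path,
mount point, and timeout values here are illustrative, not our actual
configuration):

# Illustrative only: timeo is in tenths of a second, so timeo=100 waits 10
# seconds per attempt; with retrans=3 an unresponsive server surfaces as an
# I/O error after roughly 10 * (3 + 1) = 40 seconds rather than blocking.
mount -t nfs4 \
  -o rw,soft,noexec,nosuid,nodev,proto=tcp,rsize=65535,wsize=65535,timeo=100,retrans=3 \
  netapp-filer:/vol/activemq /mnt/activemq-nfs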

Thanks,
Paul

On Fri, Mar 18, 2016 at 3:57 PM, James A. Robinson <ji...@gmail.com>
wrote:

> Lowering timeo on the client side doesn't appear to do jack to help with
> this situation.  I'm reluctant to switch from "hard" to "soft" because of
> the warning that it can easily lead to corruption issues, but I do see some
> discussions where people say flipping to that mode helped them detect lost
> locks.
>
> Anyone who has an HA configuration using NFS that they know works for
> failover care to share exactly what mount settings they are using?
>
>
> On Fri, Mar 18, 2016 at 8:51 AM, James A. Robinson <jim.robinson@gmail.com>
> wrote:
>
> > Yes indeed there was a problem w/ the underlying NFS connection, logged at
> > the OS level.  It's funny, this service wasn't even under load when the
> > timeout happened, so NFS is living up to my expectations already.
> >
> > So I could either lower the client side timeouts to fit within the 30
> > second lease, or I could raise the server side lease time to match the 180
> > seconds the client will try for.
> >
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@gmail.com>.
To answer my own question: no, UDP isn't really an option, at least not with NFS 4:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-nfs.html


On Mon, Mar 21, 2016 at 3:46 PM James A. Robinson <ji...@gmail.com>
wrote:

> I suppose I should pose the question to folks that use NFS: do you use TCP
> or UDP for your NFS stack?
>
>
> On Mon, Mar 21, 2016 at 3:29 PM James A. Robinson <ji...@gmail.com>
> wrote:
>
>> Running a packet trace while the problem was occurring showed that the
>> NFS layer isn't even involved for 3 minutes: the TCP layer retries for 3
>> minutes before notifying the caller of the error.
>>
>> The client sends SYN packets, doesn't get an ACK, and finally times out
>> after 3 minutes.  The documentation for Linux (the client in this case)
>> indicates the following default:
>>
>> tcp_syn_retries (integer; default: 5; since Linux 2.2)
>>               The maximum number of times initial SYNs for an active TCP
>>               connection attempt will be retransmitted.  This value should
>>               not be higher than 255.  The default value is 5, which
>>               corresponds to approximately 180 seconds.
>>
>> I think this means that it's not possible to get faster failover w/o
>> modifying the default values for the TCP layer.  This particular error is
>> always going to take at least 3 minutes to surface.
>>
>> Jim
>>
>>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@gmail.com>.
I suppose I should pose the question to folks that use NFS: do you use TCP
or UDP for your NFS stack?

On Mon, Mar 21, 2016 at 3:29 PM James A. Robinson <ji...@gmail.com>
wrote:

> Running a packet trace while the problem was occurring showed that the NFS
> layer isn't even involved for 3 minutes: the TCP layer retries for 3 minutes
> before notifying the caller of the error.
>
> The client sends SYN packets, doesn't get an ACK, and finally times out
> after 3 minutes.  The documentation for Linux (the client in this case)
> indicates the following default:
>
> tcp_syn_retries (integer; default: 5; since Linux 2.2)
>               The maximum number of times initial SYNs for an active TCP
>               connection attempt will be retransmitted.  This value should
>               not be higher than 255.  The default value is 5, which
>               corresponds to approximately 180 seconds.
>
> I think this means that it's not possible to get faster failover w/o
> modifying the default values for the TCP layer.  This particular error is
> always going to take at least 3 minutes to surface.
>
> Jim
>
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@gmail.com>.
Running a packet trace while the problem was occurring showed that the NFS
layer isn't even involved for 3 minutes: the TCP layer retries for 3 minutes
before notifying the caller of the error.

The client sends SYN packets, doesn't get an ACK, and finally times out
after 3 minutes.  The documentation for Linux (the client in this case)
indicates the following default:

tcp_syn_retries (integer; default: 5; since Linux 2.2)
              The maximum number of times initial SYNs for an active TCP
              connection attempt will be retransmitted.  This value should
              not be higher than 255.  The default value is 5, which
              corresponds to approximately 180 seconds.

I think this means that it's not possible to get faster failover w/o
modifying the default values for the TCP layer.  This particular error is
always going to take at least 3 minutes to surface.
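
If someone did want to change it, the relevant knob is a client-side sysctl.
A sketch, with an illustrative value rather than a recommendation (note it
affects every outbound TCP connection on the host, not just NFS):

# Show the current setting (the kernel default is 5).
sysctl net.ipv4.tcp_syn_retries

# Illustrative: drop to 3 retries.  With the legacy 3 second initial
# retransmission timeout implied by the ~180 second figure above
# (3 + 6 + 12 + 24 + 48 + 96 ~= 189s for 5 retries), 3 retries works out to
# roughly 3 + 6 + 12 + 24 = 45 seconds before the connect attempt fails.
sysctl -w net.ipv4.tcp_syn_retries=3

# Persist across reboots.
echo 'net.ipv4.tcp_syn_retries = 3' >> /etc/sysctl.conf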

Jim

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@gmail.com>.
Lowering timeo on the client side doesn't appear to do jack to help with this
situation.  I'm reluctant to switch from "hard" to "soft" because of the
warning that it can easily lead to corruption issues, but I do see some
discussions where people say flipping to that mode helped them detect lost
locks.

Anyone who has an HA configuration using NFS that they know works for
failover care to share exactly what mount settings they are using?


On Fri, Mar 18, 2016 at 8:51 AM, James A. Robinson <ji...@gmail.com>
wrote:

> Yes indeed there was a problem w/ the underlying NFS connection, logged at
> the OS level.  It's funny, this service wasn't even under load when the
> timeout happened, so NFS is living up to my expectations already.
>
> So I could either lower the client side timeouts to fit within the 30
> second lease, or I could raise the server side lease time to match the 180
> seconds the client will try for.
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@gmail.com>.
Yes indeed there was a problem w/ the underlying NFS connection, logged at
the OS level.  It's funny, this service wasn't even under load when the
timeout happened, so NFS is living up to my expectations already.

So I could either lower the client side timeouts to fit within the 30
second lease, or I could raise the server side lease time to match the 180
seconds the client will try for.
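
For the second option, if the server were a stock Linux knfsd box (a vendor
filer like a NetApp exposes this through its own admin interface instead),
the NFSv4 lease is visible under /proc.  Purely a sketch, with an
illustrative value:

# Current NFSv4 lease time in seconds (Linux NFS server only).
cat /proc/fs/nfsd/nfsv4leasetime

# Illustrative: raise the lease to 90 seconds.  This generally needs to be
# set before nfsd starts (or nfsd restarted) for it to take effect.
echo 90 > /proc/fs/nfsd/nfsv4leasetime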


On Fri, Mar 18, 2016 at 8:26 AM, James A. Robinson <ji...@highwire.org>
wrote:

> Yes, the combination of settings in place right now could add up to 3
> minutes:
>
> On the client side:
>
> nfsvers=4,proto=tcp,
> hard,timeo=600,retrans=2,ac,acregmin=3,acregmax=60,acdirmin=30,fg,retry=120,
> sharecache,lookupcache=all,cto
>
> So right now it's got a 60 second timeo value, and it will retry up to 2
> times.  I'll see if I can find any OS level messages about the NFS server
> lock, or if the NFS server reported anything.
>
> On the server side:
>
> read delegation, 30 second lease for locks, grace period 45 seconds
>
>
> On Fri, Mar 18, 2016 at 6:30 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
>
>> I'd say it's more likely that either 1) NFS gave away the lock when it
>> shouldn't have, or 2) network conditions were such that your master lost
>> connectivity and NFS rightly allowed the slave to take it.  In either case,
>> useful logging could only come from your NFS server.
>>
>> Separately from the question of why this happened, I'm concerned that it
>> took 3 minutes for the master to recognize it had lost the lock (during
>> which time you'd have had a dual-master situation).  Can that be explained
>> by your specific NFS settings?
>>
>> Tim
>
>
>

Re: NFS v4 locks "given up" w/o any logging?

Posted by "James A. Robinson" <ji...@highwire.org>.
Yes, the combination of settings in place right now could add up to 3
minutes:

On the client side:

nfsvers=4,proto=tcp,
hard,timeo=600,retrans=2,ac,acregmin=3,acregmax=60,acdirmin=30,fg,retry=120,
sharecache,lookupcache=all,cto

So right now it's got a 60 second timeo value, and it will retry up to 2
times.  I'll see if I can find any OS level messages about the NFS server
lock, or if the NFS server reported anything.

On the server side:

read delegation, 30 second lease for locks, grace period 45 seconds
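
To spell out the client-side arithmetic above: timeo is in tenths of a
second, so timeo=600 is a 60 second wait per attempt, and retrans=2 allows
two retries on top of the original request, i.e. roughly 60 * 3 = 180 seconds
before a major timeout.  A sketch of a mount that shrinks that window to fit
inside the 30 second lease (the server export and mount point here are
illustrative, and as noted further up the thread lowering timeo alone turned
out not to help once the stall was down at the TCP connection level):

# Illustrative only: 10 second attempts, two retries, so a "server not
# responding" major timeout is reported after roughly 10 * (2 + 1) = 30
# seconds instead of 180.  With "hard" the client still blocks and keeps
# retrying; it just notices, and logs, the problem sooner.
mount -t nfs -o nfsvers=4,proto=tcp,hard,timeo=100,retrans=2 \
  nfs-server:/export/activemq /mnt/activemq-nfs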


On Fri, Mar 18, 2016 at 6:30 AM, Tim Bain <tb...@alumni.duke.edu> wrote:

> I'd say it's more likely that either 1) NFS gave away the lock when it
> shouldn't have, or 2) network conditions were such that your master lost
> connectivity and NFS rightly allowed the slave to take it.  In either case,
> useful logging could only come from your NFS server.
>
> Separately from the question of why this happened, I'm concerned that it
> took 3 minutes for the master to recognize it had lost the lock (during
> which time you'd have had a dual-master situation).  Can that be explained
> by your specific NFS settings?
>
> Tim

Re: NFS v4 locks "given up" w/o any logging?

Posted by Tim Bain <tb...@alumni.duke.edu>.
I'd say it's more likely that either 1) NFS gave away the lock when it
shouldn't have, or 2) network conditions were such that your master lost
connectivity and NFS rightly allowed the slave to take it.  In either case,
useful logging could only come from your NFS server.

Separately from the question of why this happened, I'm concerned that it
took 3 minutes for the master to recognize it had lost the lock (during
which time you'd have had a dual-master situation).  Can that be explained
by your specific NFS settings?

Tim
On Mar 18, 2016 7:04 AM, "James A. Robinson" <ji...@gmail.com> wrote:

> Is it common that an activemq broker might give up its NFS v4 lock w/o
> logging any sort of message?  I've got two brokers that logged this:
>
> broker-a which held the lock:
> 2016-03-17 15:01:51,113 [yMonitor Worker] WARN  Transport
>    - Transport Connection to: tcp://104.232.16.4:62269 failed:
> org.apache.activemq.transport.InactivityIOException: Channel was inactive
> for too (>30000) long: tcp://xxx.xxx.xxx.xxx:62269
> 2016-03-18 00:05:22,751 [KeepAlive Timer] INFO  LockFile
>     - Lock file /var/log/activemq/activemq-data/amq-dev-1/lock, locked at
> Thu Mar 17 13:38:33 PDT 2016, has been modified at Fri Mar 18 00:02:15 PDT
> 2016
> 2016-03-18 00:05:22,758 [KeepAlive Timer] ERROR LockableServiceSupport
>     - amq-dev-1, no longer able to keep the exclusive lock so giving up
> being a master
> 2016-03-18 00:05:22,761 [KeepAlive Timer] INFO  BrokerService
>    - Apache ActiveMQ 5.13.2 (amq-dev-1, ID:cluster-51079-1458247119790-1:1)
> is shutting down
>
> broker-b which appeared to steal the lock:
> 2016-03-17 13:38:52,680 [JMX connector  ] INFO  ManagementContext
>    - JMX consoles can connect to
> service:jmx:rmi://localhost:2020/jndi/rmi://localhost:2020/jmxrmi
> 2016-03-18 00:02:23,593 [erSimpleAppMain] INFO  MessageDatabase
>    - KahaDB is version 6
> 2016-03-18 00:02:23,762 [erSimpleAppMain] INFO  MessageDatabase
>    - Recovering from the journal @1:63912
> 2016-03-18 00:02:24,043 [erSimpleAppMain] INFO  MessageDatabase
>    - Recovery replayed 7130 operations from the journal in 0.296 seconds.
> 2016-03-18 00:02:24,058 [erSimpleAppMain] INFO  PListStoreImpl
>     - PListStore:[/var/log/activemq/activemq-data/amq-dev-1/tmp_storage]
> started
>