Posted to issues@activemq.apache.org by "Jigar Shah (Jira)" <ji...@apache.org> on 2019/09/05 17:48:00 UTC

[jira] [Commented] (ARTEMIS-2250) Shared store lock is not monitored while running

    [ https://issues.apache.org/jira/browse/ARTEMIS-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923654#comment-16923654 ] 

Jigar Shah commented on ARTEMIS-2250:
-------------------------------------

Hello Bas,

We are running into a similar situation while using a master-slave setup on AWS/EFS:

"Live gives up control and Backup gains control, and a situation arises where both Live and Backup are active at the same time, manipulating the journal and at times leaving it unrecoverable." We have observed this with Artemis 2.6.3 and also Artemis 2.7.0. In our QA environment it happens on average once or twice a week. We are also trying to find a way to reproduce it consistently on AWS/EFS, but are not there yet.

_"We were able to prevent the occurrence by tweaking EFS connection settings so it does not occur anymore in our setup."_

You mentioned this above. It would be very helpful if you could share the connection settings/mount parameters that work in your setup.

Many Thanks

> Shared store lock is not monitored while running
> ------------------------------------------------
>
>                 Key: ARTEMIS-2250
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2250
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.4
>         Environment: AWS EFS (NFS)
>            Reporter: Bas
>            Priority: Major
>
> When using the shared store, the live server can lose the lock on the journal without noticing it. This can happen when a shared file system is used, as in AWS, where we use EFS.
> This can cause problems when the live server regains the network file system connection and simply continues to process messages. At some point the live or the backup quits because it notices changes on the file system that it did not make itself.
> We were able to prevent the occurrence by tweaking EFS connection settings, so it no longer occurs in our setup.
> For Artemis, we would like to share our change; maybe someone can review it and see whether it can be improved and adopted in Artemis.
> Patch is here for master:
> https://github.com/emagiz/activemq-artemis/commit/788adfbd3e5a54c63eed0810b7377641684b6fe1.patch
> Pull request: 
> https://github.com/apache/activemq-artemis/pull/2547
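
For readers unfamiliar with the problem, the idea of monitoring a lock while running can be sketched roughly as below. This is only an illustration and is not the code from the patch or pull request linked above: the class name LockMonitor, its methods, and the callback are hypothetical. It also assumes the caller supplies a meaningful check. The demo uses java.nio's FileLock.isValid(), which only reflects the local JVM's view (lock released or channel closed), so on NFS/EFS a real check would probably need something stronger.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not an Artemis class): polls a lock check and reacts once when it fails. */
public class LockMonitor implements AutoCloseable {
    private final BooleanSupplier lockStillHeld; // caller-supplied check; an assumption of this sketch
    private final Runnable onLockLost;           // e.g. stop the broker so it cannot keep writing
    private final AtomicBoolean lost = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public LockMonitor(BooleanSupplier lockStillHeld, Runnable onLockLost) {
        this.lockStillHeld = lockStillHeld;
        this.onLockLost = onLockLost;
    }

    /** Starts periodic checks in the background. */
    public void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::checkOnce, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Runs one check. Returns false once the lock has been seen as lost; the result is sticky. */
    public boolean checkOnce() {
        boolean held;
        try {
            held = lockStillHeld.getAsBoolean();
        } catch (RuntimeException e) {
            held = false; // treat a failing check as a lost lock
        }
        // compareAndSet ensures the callback fires at most once.
        if (!held && lost.compareAndSet(false, true)) {
            onLockLost.run();
        }
        return !lost.get();
    }

    public boolean isLost() {
        return lost.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("server", ".lock");
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            FileLock lock = channel.lock();
            try (LockMonitor monitor = new LockMonitor(lock::isValid,
                    () -> System.out.println("lock lost, stopping broker"))) {
                System.out.println(monitor.checkOnce()); // lock is still held here
                lock.release();                          // simulate losing the lock
                System.out.println(monitor.checkOnce()); // callback fires on this check
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In a broker, start(periodMillis) would run the check in the background instead of calling checkOnce() by hand. The polling period is a trade-off: a shorter period narrows the window in which two servers can both be active, at the cost of more traffic to the shared file system.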



--
This message was sent by Atlassian Jira
(v8.3.2#803003)