Posted to issues@geode.apache.org by "Anilkumar Gingade (Jira)" <ji...@apache.org> on 2020/08/10 18:37:00 UTC

[jira] [Updated] (GEODE-6901) If a region is replicate and replicate persistent in different members and a replicate persistent member crashes, the replicate members throw a ToDataException attempting to synchronize the region

     [ https://issues.apache.org/jira/browse/GEODE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anilkumar Gingade updated GEODE-6901:
-------------------------------------
    Labels: GeodeOperationAPI caching-applications pull-request-available  (was: caching-applications pull-request-available)

> If a region is replicate and replicate persistent in different members and a replicate persistent member crashes, the replicate members throw a ToDataException attempting to synchronize the region
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-6901
>                 URL: https://issues.apache.org/jira/browse/GEODE-6901
>             Project: Geode
>          Issue Type: Bug
>          Components: persistence, regions
>            Reporter: Barrett Oglesby
>            Priority: Major
>              Labels: GeodeOperationAPI, caching-applications, pull-request-available
>         Attachments: 0001-GEODE-6901-Modified-RegionVersionVector-to-handle-a-.patch
>
>
> If a region is defined as replicate in some members and replicate persistent in others, and a replicate persistent member crashes, the replicate members throw a ToDataException while attempting to synchronize the region.
> In this case, an exception like this is thrown in the replicate member:
> {noformat}
> [warn 2019/06/21 17:06:33.516 PDT <Timer-2> tid=0x2b] Timer task <or...@6477fd5e> encountered exception
> org.apache.geode.ToDataException: class org.apache.geode.internal.cache.versions.VMRegionVersionVector
>  at org.apache.geode.internal.InternalDataSerializer.invokeToData(InternalDataSerializer.java:2331)
>  at org.apache.geode.internal.InternalDataSerializer.writeDSFID(InternalDataSerializer.java:1492)
>  at org.apache.geode.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2067)
>  at org.apache.geode.DataSerializer.writeObject(DataSerializer.java:2943)
>  at org.apache.geode.internal.cache.InitialImageOperation$RequestImageMessage.toData(InitialImageOperation.java:2135)
>  at org.apache.geode.internal.InternalDataSerializer.invokeToData(InternalDataSerializer.java:2300)
>  at org.apache.geode.internal.InternalDataSerializer.writeDSFID(InternalDataSerializer.java:1492)
>  at org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:242)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:385)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:241)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:596)
>  at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1711)
>  at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1892)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2852)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2779)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2816)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1526)
>  at org.apache.geode.internal.cache.InitialImageOperation.synchronizeWith(InitialImageOperation.java:649)
>  at org.apache.geode.internal.cache.DistributedRegion.synchronizeWith(DistributedRegion.java:1321)
>  at org.apache.geode.internal.cache.DistributedRegion.synchronizeForLostMember(DistributedRegion.java:1310)
>  at org.apache.geode.internal.cache.DistributedRegion.performSynchronizeForLostMemberTask(DistributedRegion.java:1295)
>  at org.apache.geode.internal.cache.DistributedRegion$1.run2(DistributedRegion.java:1285)
>  at org.apache.geode.internal.SystemTimer$SystemTimerTask.run(SystemTimer.java:445)
>  at java.util.TimerThread.mainLoop(Timer.java:555)
>  at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.lang.ClassCastException: org.apache.geode.internal.cache.persistence.DiskStoreID cannot be cast to org.apache.geode.distributed.internal.membership.InternalDistributedMember
>  at org.apache.geode.internal.cache.versions.VMRegionVersionVector.writeMember(VMRegionVersionVector.java:31)
>  at org.apache.geode.internal.cache.versions.RegionVersionVector.toData(RegionVersionVector.java:1204)
>  at org.apache.geode.internal.InternalDataSerializer.invokeToData(InternalDataSerializer.java:2300)
>  ... 24 more
> {noformat}
> RegionVersionVector.java:1204 is here:
> {noformat}
>    for (Map.Entry<T, RegionVersionHolder<T>> entry : this.memberToVersion.entrySet()) {
> ->   writeMember(entry.getKey(), out);
>      InternalDataSerializer.invokeToData(entry.getValue(), out);
>    }
> {noformat}
> VMRegionVersionVector expects the keys of the memberToVersion map to be InternalDistributedMember instances:
> {noformat}
> protected void writeMember(InternalDistributedMember member, DataOutput out) throws IOException {
> {noformat}
> Logging in RegionVersionVector.toData shows that the RegionVersionVector in this member is a VMRegionVersionVector, yet its memberToVersion map contains a DiskStoreID key. Passing that key to writeMember causes the ClassCastException.
> {noformat}
> This RegionVersionVector's (class=VMRegionVersionVector) memberToVersion map contains the following 1 entries:
>  member=402d383b29fa4c31-8597a3b72674bf5d; class=DiskStoreID
> {noformat}
> The documentation (https://gemfire.docs.pivotal.io/98/geode/managing/disk_storage/starting_system_with_disk_stores.html) makes it sound like this is a supported configuration:
> {noformat}
> For replicated regions, where you define persistence only in some of the region's host members, start the persistent replicate members prior to the non-persistent replicate members to make sure the data is recovered from disk.
> {noformat}
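
The ClassCastException described above can be illustrated outside Geode with a small, self-contained Java sketch. All names in it (VersionSource, MemberId, DiskId, Vector, VmVector, recordVersion) are hypothetical stand-ins, not Geode classes or methods. The unchecked cast in recordVersion is only an assumed way for a wrongly typed key to enter the map; the report does not say how the DiskStoreID got there. The sketch shows why the failure surfaces in writeMember rather than at insertion: type erasure lets the key into the map without a runtime check, and the cast to the concrete key type only happens when the subclass's writeMember is invoked.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VersionVectorCastSketch {
    // Hypothetical stand-ins for the two kinds of member ID in the report.
    interface VersionSource {}

    static final class MemberId implements VersionSource {
        final String name;
        MemberId(String name) { this.name = name; }
    }

    static final class DiskId implements VersionSource {
        final String id;
        DiskId(String id) { this.id = id; }
    }

    // Generic vector, loosely modelled on the shape of the toData loop quoted above.
    abstract static class Vector<T extends VersionSource> {
        final Map<T, Long> memberToVersion = new LinkedHashMap<>();

        // Assumption: some code path adds a key without a compile-time type check.
        @SuppressWarnings("unchecked")
        void recordVersion(VersionSource id, long version) {
            // After erasure T is just VersionSource, so this cast checks nothing at runtime.
            memberToVersion.put((T) id, version);
        }

        abstract void writeMember(T member, StringBuilder out);

        void toData(StringBuilder out) {
            for (Map.Entry<T, Long> entry : memberToVersion.entrySet()) {
                // The call dispatches to the subclass, which casts the key to its concrete type.
                writeMember(entry.getKey(), out);
                out.append('=').append(entry.getValue()).append(';');
            }
        }
    }

    // Vector whose keys are expected to be MemberId only.
    static final class VmVector extends Vector<MemberId> {
        @Override
        void writeMember(MemberId member, StringBuilder out) {
            out.append(member.name);
        }
    }

    // Returns true if serializing a vector holding a DiskId key throws ClassCastException.
    static boolean serializationFails() {
        VmVector vector = new VmVector();
        vector.recordVersion(new DiskId("402d383b29fa4c31"), 1L);
        try {
            vector.toData(new StringBuilder());
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ClassCastException thrown: " + serializationFails());
    }
}
```

Running main prints "ClassCastException thrown: true". A vector that only ever receives MemberId keys serializes without error, which is consistent with the problem appearing only once a disk-store-based ID ends up in a vector typed for member IDs.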



--
This message was sent by Atlassian Jira
(v8.3.4#803005)