Posted to issues@geode.apache.org by "Juan José Ramos Cassella (JIRA)" <ji...@apache.org> on 2019/04/05 12:10:00 UTC

[jira] [Resolved] (GEODE-6551) Multiple Executions of RegionAlterFunction Leaves Partition Region Inconsistent

     [ https://issues.apache.org/jira/browse/GEODE-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juan José Ramos Cassella resolved GEODE-6551.
---------------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

> Multiple Executions of RegionAlterFunction Leaves Partition Region Inconsistent
> -------------------------------------------------------------------------------
>
>                 Key: GEODE-6551
>                 URL: https://issues.apache.org/jira/browse/GEODE-6551
>             Project: Geode
>          Issue Type: Bug
>          Components: configuration, gfsh, wan
>            Reporter: Juan José Ramos Cassella
>            Assignee: Juan José Ramos Cassella
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When trying to assign a non-persistent parallel {{gateway-sender}} / {{async-event-queue}} to a persistent partitioned region through {{gfsh}}, the actual region is left inconsistent in the {{cluster configuration service}} if the internal function is executed more than once.
>  The problem is that the {{gateway-sender}} / {{async-event-queue}} is added to the internal list too early within the execution lifecycle and, if the actual addition fails afterwards, the internal list is never reverted to its original state. When the function is executed a second time it reports success, and the invalid configuration is then persisted into the cluster configuration service, so the subsequent restart of the servers fails.
>  The following set of steps reproduces the problem for a {{gateway-sender}}, but the logic is exactly the same for an {{async-event-queue}}:
> {noformat}
> gfsh -e "start locator --name=locator --port=10101"
> gfsh -e "start server --name=server --server-port=40404 --locators=localhost[10101]"
> gfsh -e "connect --locator=localhost[10101]" -e "create disk-store --name=diskStore --dir=diskStore"
> gfsh -e "connect --locator=localhost[10101]" -e "create region --name=testRegion --type=PARTITION_PERSISTENT --disk-store=diskStore"
> gfsh -e "connect --locator=localhost[10101]" -e "create gateway-sender --id=gateway --parallel=true --remote-distributed-system-id=2 --enable-persistence=false"
> # First Execution Fails
> gfsh -e "connect --locator=localhost[10101]" -e "alter region --name=testRegion --gateway-sender-id=gateway"
> Member | Status | Message
> ------ | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------------
> server | ERROR  |  org.apache.geode.internal.cache.wan.GatewaySenderException: Non persistent gateway sender gateway can not be attached to persistent region /testRegion
> # Second Execution Succeeds
> gfsh -e "connect --locator=localhost[10101]" -e "alter region --name=testRegion --gateway-sender-id=gateway"
> Member | Status | Message
> ------ | ------ | -------------------------
> server | OK     | Region testRegion altered
> gfsh -e "connect --locator=localhost[10101]" -e "stop server --name=server"
> gfsh -e "start server --name=server --server-port=40404 --locators=localhost[10101]"
> ....The Cache Server process terminated unexpectedly with exit status 1. Please refer to the log file in /server for full details.
> Exception in thread "main" org.apache.geode.internal.cache.wan.GatewaySenderException: Non persistent gateway sender gateway can not be attached to persistent region /testRegion
> 	at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR(ParallelGatewaySenderQueue.java:454)
> # The log shows that the cluster configuration received is invalid:
> [info 2019/03/21 11:52:57.606 GMT <main> tid=0x1] Received cluster configuration from the locator
> [info 2019/03/21 11:52:57.638 GMT <main> tid=0x1] 
> ***************************************************************
> Configuration for  'cluster'
> Jar files to deployed
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <cache xmlns="http://geode.apache.org/schema/cache" xmlns:jdbc="http://geode.apache.org/schema/jdbc" xmlns:lucene="http://geode.apache.org/schema/lucene" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0" xsi:schemaLocation="http://geode.apache.org/schema/lucene http://geode.apache.org/schema/lucene/lucene-1.0.xsd http://geode.apache.org/schema/jdbc http://geode.apache.org/schema/jdbc/jdbc-1.0.xsd http://geode.apache.org/schema/cache http://geode.apache.org/schema/cache/cache-1.0.xsd">
>     <gateway-sender disk-synchronous="true" enable-batch-conflation="false" enable-persistence="false" id="gateway" manual-start="false" parallel="true" remote-distributed-system-id="2"/>
>     <disk-store allow-force-compaction="false" auto-compact="true" compaction-threshold="50" disk-usage-critical-percentage="99" disk-usage-warning-percentage="90" max-oplog-size="1024" name="diskStore" queue-size="0" time-interval="1000" write-buffer-size="32768">
>         <disk-dirs>
>             <disk-dir dir-size="2147483647">diskStore</disk-dir>
>         </disk-dirs>
>     </disk-store>
>     <region name="testRegion" refid="PARTITION_PERSISTENT">
>         <region-attributes data-policy="persistent-partition" disk-store-name="diskStore" gateway-sender-ids="gateway"/>
>     </region>
> </cache>
> {noformat}
> The validations invoked from within the {{RegionAlterFunction}} (added through GEODE-4919) should be improved to also include the persistence checks (currently done in {{ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR}}).
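
The validate-before-mutate idea described in the issue can be illustrated with a minimal, self-contained sketch. It does not use Geode classes: `AlterRegionSketch`, `Region`, `Sender`, `validate` and `attachSender` are hypothetical names invented for this illustration, and `IllegalStateException` stands in for the `GatewaySenderException` seen in the transcript. The only assumption carried over from the issue is the rule itself: a non-persistent parallel sender cannot be attached to a persistent region.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AlterRegionSketch {

  /** Hypothetical stand-in for a gateway sender definition (not a Geode class). */
  static final class Sender {
    final String id;
    final boolean parallel;
    final boolean persistent;

    Sender(String id, boolean parallel, boolean persistent) {
      this.id = id;
      this.parallel = parallel;
      this.persistent = persistent;
    }
  }

  /** Hypothetical stand-in for a region and its list of attached sender ids. */
  static final class Region {
    final String name;
    final boolean persistent;
    final Set<String> senderIds = new LinkedHashSet<>();

    Region(String name, boolean persistent) {
      this.name = name;
      this.persistent = persistent;
    }
  }

  /** Persistence check, run before any state is touched. */
  static void validate(Region region, Sender sender) {
    if (sender.parallel && region.persistent && !sender.persistent) {
      throw new IllegalStateException("Non persistent gateway sender " + sender.id
          + " can not be attached to persistent region /" + region.name);
    }
  }

  /** Validate first, mutate second: a rejected attempt leaves the id list untouched. */
  static void attachSender(Region region, Sender sender) {
    validate(region, sender);
    region.senderIds.add(sender.id);
  }

  public static void main(String[] args) {
    Region region = new Region("testRegion", true);
    Sender sender = new Sender("gateway", true, false);

    // Repeat the same alteration twice, as in the reproduction steps.
    for (int attempt = 1; attempt <= 2; attempt++) {
      try {
        attachSender(region, sender);
        System.out.println("attempt " + attempt + ": altered");
      } catch (IllegalStateException e) {
        System.out.println("attempt " + attempt + ": " + e.getMessage());
      }
    }
    System.out.println("sender ids after both attempts: " + region.senderIds);
  }
}
```

Because the check runs before the id is added, repeating the same invalid alteration is rejected every time and no invalid id is left behind to be persisted. How the actual fix orders these steps inside {{RegionAlterFunction}} is not shown in this message.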


