Posted to issues@geode.apache.org by "Darrel Schneider (JIRA)" <ji...@apache.org> on 2016/02/10 20:31:18 UTC

[jira] [Updated] (GEODE-482) deserialization can hang for one minute waiting for a DataSerializer

     [ https://issues.apache.org/jira/browse/GEODE-482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darrel Schneider updated GEODE-482:
-----------------------------------
    Component/s:     (was: core)
                 serialization

> deserialization can hang for one minute waiting for a DataSerializer
> --------------------------------------------------------------------
>
>                 Key: GEODE-482
>                 URL: https://issues.apache.org/jira/browse/GEODE-482
>             Project: Geode
>          Issue Type: Bug
>          Components: serialization
>            Reporter: Darrel Schneider
>
> If a JVM does not explicitly register a DataSerializer it is going to use, but instead relies on Geode to distribute the DataSerializer to it from another member or server, then a race condition exists that can cause it to wait for 1 minute and then fail to find the DataSerializer.
> The workaround for this is to explicitly register the DataSerializer using a static initializer or the cache.xml serializer element.
> A unit test was intermittently hitting this problem (see GEODE-376), but that test has been changed to work around the race in the product.
> The race is in this code in com.gemstone.gemfire.internal.InternalDataSerializer.getSerializer(int):
>     SerializerAttributesHolder sah=idsToHolders.get(idx);
>     while (result == null && !timedOut && sah == null) {
>       Object o = idsToSerializers.putIfAbsent(idx, marker);
>       if (o == null) {
>         result = marker.getSerializer();
> The race occurs if getSerializer sees a null "sah" and then, before it can do the "idsToSerializers.putIfAbsent", another thread executes this code in com.gemstone.gemfire.internal.InternalDataSerializer.register(String, boolean, SerializerAttributesHolder):
>     if (className == null || className.trim().equals("")) {
>       throw new IllegalArgumentException("Class name cannot be null or empty.");
>     }
>     SerializerAttributesHolder oldValue = dsClassesToHolders.putIfAbsent(
>         className, holder);
>     if (oldValue != null) {
>       if (oldValue.getId() != 0 && holder.getId() != 0
>           && oldValue.getId() != holder.getId()) {
>         throw new IllegalStateException(snip);
>       }
>     }
>     idsToHolders.putIfAbsent(holder.getId(), holder);
>     Object ds = idsToSerializers.get(holder.getId());
>     if (ds instanceof Marker) {
>       synchronized (ds) {
>         ((Marker)ds).notifyAll();
>       }
>     }
> So the registering thread does not see the Marker and does not notify it.
> That leaves the first thread stuck in Marker.getSerializer, which blocks for 1 minute and then returns null.
> A new test needs to be written that will reliably fail for this bug.
> A multi-threaded unit test that uses these two methods would be best.
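
The interleaving described above can be sketched as a small standalone Java model. This is not the Geode code: LostNotifySketch, its simplified Marker, the String holders, the raceWindow hook, and the short timeout are all assumptions made for illustration. The hook pauses the reader between the null check and the putIfAbsent so that a registering thread can run to completion inside that window, which makes the lost notification happen on every run rather than intermittently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of the race; names and types are illustrative, not Geode's.
class LostNotifySketch {
  // Simplified stand-in for Marker: waits until signalled or until the timeout expires.
  static final class Marker {
    private boolean notified;

    synchronized boolean await(long timeoutMs) {
      long deadline = System.currentTimeMillis() + timeoutMs;
      long remaining = timeoutMs;
      try {
        while (!notified && remaining > 0) {
          wait(remaining);
          remaining = deadline - System.currentTimeMillis();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return notified;
    }

    synchronized void signal() {
      notified = true;
      notifyAll();
    }
  }

  private final ConcurrentMap<Integer, String> idsToHolders = new ConcurrentHashMap<>();
  private final ConcurrentMap<Integer, Object> idsToSerializers = new ConcurrentHashMap<>();

  // Reader side: check for a holder, then publish a marker and wait on it.
  // raceWindow is a test hook that runs between the check and the putIfAbsent.
  String getSerializer(int id, long timeoutMs, Runnable raceWindow) {
    String holder = idsToHolders.get(id);
    if (holder != null) {
      return holder;
    }
    raceWindow.run();
    Marker marker = new Marker();
    Object previous = idsToSerializers.putIfAbsent(id, marker);
    if (previous instanceof Marker) {
      marker = (Marker) previous;
    }
    // Returns null on timeout, like the blocked reader in the report.
    return marker.await(timeoutMs) ? idsToHolders.get(id) : null;
  }

  // Registering side: record the holder, then notify a marker only if one is already present.
  void register(int id, String holder) {
    idsToHolders.putIfAbsent(id, holder);
    Object ds = idsToSerializers.get(id);
    if (ds instanceof Marker) {
      ((Marker) ds).signal();
    }
  }

  // Forces the bad interleaving: register() runs fully inside the race window.
  static String reproduce(long timeoutMs) {
    LostNotifySketch sketch = new LostNotifySketch();
    Runnable raceWindow = () -> {
      Thread registrar = new Thread(() -> sketch.register(42, "ExampleHolder"));
      registrar.start();
      try {
        registrar.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    return sketch.getSerializer(42, timeoutMs, raceWindow);
  }

  public static void main(String[] args) {
    // The holder was registered, yet the reader times out and gets null.
    System.out.println(reproduce(200)); // prints "null"
  }
}
```

A test against the real InternalDataSerializer would need some equivalent way to hold the reading thread between the two steps; whether such a hook exists or would have to be added is not stated in this report.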



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)