Posted to common-issues@hadoop.apache.org by "Henning Blohm (JIRA)" <ji...@apache.org> on 2010/10/21 10:14:19 UTC

[jira] Created: (HADOOP-7004) Problem with org.apache.hadoop.conf.Configuration.REGISTRY

Problem with org.apache.hadoop.conf.Configuration.REGISTRY
----------------------------------------------------------

                 Key: HADOOP-7004
                 URL: https://issues.apache.org/jira/browse/HADOOP-7004
             Project: Hadoop Common
          Issue Type: Bug
         Environment: hadoop 0.20.2, hbase 0.20.6
            Reporter: Henning Blohm
            Priority: Minor


When a Configuration instance that had a resource added through
addResource(InputStream) is reused, a later reload of the configuration
fails because the stream has already been read.
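
The underlying effect can be reproduced without Hadoop. The following is a minimal sketch using only the JDK; the class name StreamReadOnceDemo and the helper drain are made up for illustration and are not part of Hadoop. It treats the first read as the initial load and the second read as the reload.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Hadoop code.
final class StreamReadOnceDemo {
    // Reads whatever is still left in the stream and returns it as text.
    static String drain(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "<configuration/>".getBytes(StandardCharsets.UTF_8));
        String first = drain(in);   // stands in for the initial load
        String second = drain(in);  // stands in for the later reload
        System.out.println("first load empty?  " + first.isEmpty());
        System.out.println("second load empty? " + second.isEmpty());
    }
}
```

The second read returns no data, so a parser fed from the same stream on reload has nothing to parse.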

The reload is triggered when addDefaultResource is called. That method
iterates over the static REGISTRY WeakHashMap to reach every Configuration
instance that has not yet been garbage-collected and resets its properties.
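
To make the mechanism concrete, here is a simplified model of the pattern. It is a sketch under assumptions, not Hadoop's source: the class MiniConfig, the reload counter, and the resource name "extra-default.xml" are invented for illustration. Only the names REGISTRY, addDefaultResource and reloadConfiguration come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for the registry pattern; not the real Configuration class.
final class MiniConfig {
    // Every instance registers itself; weak keys let the GC still collect them.
    private static final Map<MiniConfig, Object> REGISTRY = new WeakHashMap<>();
    private static final List<String> DEFAULT_RESOURCES = new ArrayList<>();

    private int reloadCount = 0;

    MiniConfig() {
        synchronized (MiniConfig.class) {
            REGISTRY.put(this, Boolean.TRUE);
        }
    }

    // Static call that touches every registered instance, whoever owns it.
    static synchronized void addDefaultResource(String name) {
        if (!DEFAULT_RESOURCES.contains(name)) {
            DEFAULT_RESOURCES.add(name);
            for (MiniConfig conf : REGISTRY.keySet()) {
                conf.reloadConfiguration();
            }
        }
    }

    // In this model a reload is only counted; a real class would drop cached properties.
    synchronized void reloadConfiguration() {
        reloadCount++;
    }

    synchronized int getReloadCount() {
        return reloadCount;
    }

    public static void main(String[] args) {
        MiniConfig mine = new MiniConfig();
        // Imagine this call happening in a static initializer of an unrelated class.
        MiniConfig.addDefaultResource("extra-default.xml");
        System.out.println("reloads seen by my instance: " + mine.getReloadCount());
    }
}
```

The owner of an instance never asked for a reload, yet its instance is reset as soon as any code anywhere registers a new default resource.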

The method addDefaultResource is called by, for example, ConfigUtil in
org.apache.hadoop.mapreduce.util (Hadoop trunk) or JobConf (Hadoop 0.20.2).

The problem has been observed in Hadoop 0.20.2, but the code in trunk has
essentially the same structure.

There are a few problems here:

1. You cannot safely use addResource(InputStream) if Configuration
objects are to be reused (you can, however, use addResource(URL) instead).

2. Modifying the state of Configuration instances at some later point in
time, as a side effect of class initialization in a completely unrelated
thread, leads to unpredictable behavior (properties change under the hood).

3. Configuration instances keep context classloaders to find resources. After redeployment these may not be "valid" anymore.
As long as a Configuration instance has not been garbage-collected, addDefaultResource will still invoke reloadConfiguration on it.
While that is harmless today (it only resets members), this looks like a ticking time bomb.
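
For problem 1, one possible workaround is to turn the one-shot stream into something that can be reopened. The sketch below uses only the JDK and copies the stream to a temporary file, returning a URL for it; the class ReloadableResource and its methods are hypothetical helpers, not part of Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not Hadoop code.
final class ReloadableResource {
    // Copies a one-shot stream into a temp file and returns a URL that can be opened repeatedly.
    static URL toReloadableUrl(InputStream in) {
        try {
            Path tmp = Files.createTempFile("conf-", ".xml");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a fresh stream on every call, so repeated reads see the full content.
    static String readAll(URL url) {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned URL could then be passed to addResource(URL), so that every reload opens a fresh stream. This does not address problems 2 and 3.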

Suggestion:
Define all default resources in Configuration once. Do not hold on to
other configuration instances and do not modify their state as a side
effect of some other activity.

See also: http://osdir.com/ml/general/2010-10/msg25893.html


