Posted to common-issues@hadoop.apache.org by "Steve Loughran (Commented) (JIRA)" <ji...@apache.org> on 2012/02/18 22:22:05 UTC

[jira] [Commented] (HADOOP-7614) Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException

    [ https://issues.apache.org/jira/browse/HADOOP-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211093#comment-13211093 ] 

Steve Loughran commented on HADOOP-7614:
----------------------------------------

Configurations last a long time and can get passed around, so I'm reluctant to do anything that could leak things or even add XML files to the in-memory configuration.

Removing the behaviour altogether would break things (it's hard to tell from the general Object interface).

I think in this situation, handling reloading by doing nothing may be the best tactic, possibly warning the user somehow. Once an inputstream has been loaded, the {{loadDefaults}} flag could be set to false. However, that would change behaviour in copied configurations, which will not crash on a reload.
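
A minimal sketch of the "do nothing on reload, but warn" tactic follows. It is an illustration only: OneShotStreamLoader is a hypothetical stand-alone class, not part of Hadoop's Configuration, and it returns the raw text instead of parsing properties. It remembers each stream by identity and refuses to read it a second time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Hypothetical helper (not a Hadoop class): reads each stream at most once. */
class OneShotStreamLoader {
    // Identity-based set, so two distinct stream objects are never confused.
    private final Set<InputStream> consumed =
            Collections.newSetFromMap(new IdentityHashMap<InputStream, Boolean>());

    /** Returns the stream's text on the first load, or null (after a warning) on a reload. */
    String load(InputStream in) {
        if (!consumed.add(in)) {
            // The stream is already at end-of-file; re-parsing it would fail.
            System.err.println("WARN: stream resource already consumed; skipping reload");
            return null;
        }
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        OneShotStreamLoader loader = new OneShotStreamLoader();
        InputStream in = new ByteArrayInputStream(
                "<configuration></configuration>".getBytes(StandardCharsets.UTF_8));
        System.out.println(loader.load(in)); // first load reads the content
        System.out.println(loader.load(in)); // reload is skipped with a warning
    }
}
```

Note that the set above holds strong references to every stream it has seen, which is exactly the kind of leak a long-lived configuration should avoid; a real change would need a weaker form of bookkeeping. Values read from the stream are also not available again after a reload unless they are retained somewhere else.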

                
> Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7614
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7614
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 0.21.0
>            Reporter: Ferdy Galema
>            Priority: Minor
>         Attachments: HADOOP-7614-v1.patch, HADOOP-7614-v2.patch
>
>
> When using an inputstream as a resource for configuration, reloading this configuration will throw the following exception:
> Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
> 	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1576)
> 	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1445)
> 	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1381)
> 	at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
> ...
> Caused by: org.xml.sax.SAXParseException: Premature end of file.
> 	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
> 	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
> 	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
> 	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1504)
> 	... 4 more
> To reproduce, see the following test code:
>     Configuration conf = new Configuration();
>     ByteArrayInputStream bais = new ByteArrayInputStream("<configuration></configuration>".getBytes());
>     conf.addResource(bais);
>     System.out.println(conf.get("blah"));
>     conf.addResource("core-site.xml"); //just add a named resource, doesn't matter which one
>     System.out.println(conf.get("blah"));
> Allowing inputstream resources is flexible, but in cases such as this it can lead to difficult-to-debug problems.
> What do you think is the best solution? We could:
> A) reset the inputstream after it is read instead of closing it (but what to do when the stream does not support marking?)
> B) leave it up to the client (for example make sure you implement close() so that it resets the stream)
> C) when reading the inputstream for the first time, cache or wrap the contents somehow so that it can be read multiple times (let's at least document it)
> D) remove the inputstream method altogether
> E) something else?
> For now I have attached a patch for solution A.
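
For comparison with the attached patch, option C could be sketched roughly as below. This is an assumption-laden illustration, not the patch and not Hadoop code: CachedStreamResource, open() and rootElementName() are hypothetical names. The wrapper drains the stream once into a byte array and hands out a fresh stream for every parse, so it works even when the original stream does not support marking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical wrapper (not a Hadoop class): buffers a stream once so it can be re-parsed. */
class CachedStreamResource {
    private final byte[] contents;

    /** Drains and closes the given stream, keeping its bytes in memory. */
    CachedStreamResource(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            in.close();
            contents = buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Each call returns an independent stream positioned at the start of the cached bytes. */
    InputStream open() {
        return new ByteArrayInputStream(contents);
    }

    /** Parses the cached XML and returns the name of its root element. */
    String rootElementName() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(open());
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("cached resource is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        CachedStreamResource resource = new CachedStreamResource(new ByteArrayInputStream(
                "<configuration></configuration>".getBytes(StandardCharsets.UTF_8)));
        // Parsing twice succeeds because every parse gets a fresh stream.
        System.out.println(resource.rootElementName());
        System.out.println(resource.rootElementName());
    }
}
```

The cost of this approach is that the full XML stays in memory for as long as the wrapper lives, which matters for configurations that are long-lived and passed around.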

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira