Posted to issues@drill.apache.org by "Laurent Goujon (JIRA)" <ji...@apache.org> on 2016/01/21 02:07:39 UTC

[jira] [Created] (DRILL-4292) Avoid multiple new Configuration/FileSystem instantiation

Laurent Goujon created DRILL-4292:
-------------------------------------

             Summary: Avoid multiple new Configuration/FileSystem instantiation 
                 Key: DRILL-4292
                 URL: https://issues.apache.org/jira/browse/DRILL-4292
             Project: Apache Drill
          Issue Type: Task
            Reporter: Laurent Goujon


There are lots of places where Drill code has the following pattern:
{noformat}
conf = new Configuration();
{noformat}

or
{noformat}
    Configuration conf = new Configuration();
    [...]
    fs = FileSystem.get(conf);
{noformat}

Creating a Configuration instance is an expensive operation because it triggers a classpath scan to find resources (the scan happens lazily, when a key is accessed). These extra instances also use more memory than expected, because the default resources are not shared between Configuration instances.

FileSystem instances should not be expensive to create because, by default, instances are cached (by scheme/authority/ugi). However, this also means that the Configuration instance has little value when getting a FileSystem instance. It only matters in two cases: when getting the default filesystem (which can be replaced by FileSystem#get(URI, Configuration)), or when creating the FileSystem for the given scheme, if it is not already in the cache.
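
To illustrate why the Configuration argument is mostly ignored, here is a minimal, simplified model of a cache keyed by scheme/authority/user. This is a sketch, not Hadoop's actual implementation: FsCacheSketch and CachedFs are hypothetical stand-ins, and the configuration is reduced to a plain String label so that it is visible which one was used.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a FileSystem; it only remembers which
// configuration label it was created with.
final class CachedFs {
    final String createdWithConf;

    CachedFs(String createdWithConf) {
        this.createdWithConf = createdWithConf;
    }
}

// Simplified model of a cache keyed by scheme/authority/user.
// Assumption: the path and the configuration are NOT part of the key.
final class FsCacheSketch {
    private final Map<String, CachedFs> cache = new ConcurrentHashMap<>();

    CachedFs get(URI uri, String user, String conf) {
        String key = uri.getScheme() + "://" + uri.getAuthority() + "@" + user;
        // The configuration is only consulted on a cache miss.
        return cache.computeIfAbsent(key, k -> new CachedFs(conf));
    }
}
```

In this model, a second lookup for the same scheme, authority and user returns the instance built from the first configuration, whatever configuration the second caller passes in.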

If possible, all these cases should be refactored to inject the right FileSystem instance, instead of letting each class try to create its own.
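
The refactoring could look roughly like the sketch below. To keep it self-contained, it uses hypothetical stand-ins (SketchConfiguration, SketchFileSystem) in place of Hadoop's Configuration and FileSystem, and the reader classes are invented for illustration; none of these names exist in Drill. The counter is only there to make the number of configuration instances observable.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Configuration; counts how many instances exist.
final class SketchConfiguration {
    static final AtomicInteger INSTANCES = new AtomicInteger();

    SketchConfiguration() {
        INSTANCES.incrementAndGet();
    }
}

// Hypothetical stand-in for FileSystem.
final class SketchFileSystem {
    private final SketchConfiguration conf;

    SketchFileSystem(SketchConfiguration conf) {
        this.conf = Objects.requireNonNull(conf);
    }

    SketchConfiguration getConf() {
        return conf;
    }
}

// Before: every instance builds its own configuration and file system.
final class SelfConfiguringReader {
    private final SketchFileSystem fs;

    SelfConfiguringReader() {
        SketchConfiguration conf = new SketchConfiguration();
        this.fs = new SketchFileSystem(conf);
    }

    SketchFileSystem fileSystem() {
        return fs;
    }
}

// After: the file system is created once by the caller and injected.
final class InjectedReader {
    private final SketchFileSystem fs;

    InjectedReader(SketchFileSystem fs) {
        this.fs = Objects.requireNonNull(fs);
    }

    SketchFileSystem fileSystem() {
        return fs;
    }
}
```

With the injected variant, the caller creates one configuration and one file system up front and passes the same file system to every reader, so the number of configuration instances no longer grows with the number of readers.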



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)