Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/02 03:00:00 UTC

[jira] [Commented] (DRILL-6268) Drill-on-YARN client obtains HDFS URL incorrectly

    [ https://issues.apache.org/jira/browse/DRILL-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612020#comment-17612020 ] 

ASF GitHub Bot commented on DRILL-6268:
---------------------------------------

cgivre closed pull request #2139: DRILL-6268: Drill-on-YARN client obtains HDFS URL incorrectly
URL: https://github.com/apache/drill/pull/2139




> Drill-on-YARN client obtains HDFS URL incorrectly
> -------------------------------------------------
>
>                 Key: DRILL-6268
>                 URL: https://issues.apache.org/jira/browse/DRILL-6268
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Charles Givre
>            Priority: Major
>
> The Drill-on-YARN client must upload files to HDFS so that YARN can localize them. The code that does so is in {{DfsFacade}}. This code obtains the URL twice. The first time is correct:
> {code}
>   private void loadYarnConfig() {
>     ...
>       URI fsUri = FileSystem.getDefaultUri( yarnConf );
>       if(fsUri.toString().startsWith("file:/")) {
>         System.err.println("Warning: Default DFS URI is for a local file system: " + fsUri);
>       }
>     }
>   }
> {code}
> The {{fsUri}} returned is {{hdfs://localhost:9000}}, which is the correct value for an out-of-the-box Hadoop 2.9.0 install after following [these instructions|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html]. The instructions have the reader explicitly set the port number to 9000:
> {code}
> <configuration>
>     <property>
>         <name>fs.defaultFS</name>
>         <value>hdfs://localhost:9000</value>
>     </property>
> </configuration>
> {code}
> The other place that gets the URL, this time for real, is {{DfsFacade.connect()}}:
> {code}
>     String dfsConnection = config.getString(DrillOnYarnConfig.DFS_CONNECTION);
> {code}
> This value comes back as {{hdfs://localhost/}}, which causes HDFS to try to connect on port 8020 (the Hadoop default), resulting in the following error:
> {noformat}
> Connecting to DFS... Connected.
> Uploading /Users/paulrogers/bin/apache-drill-1.13.0.tar.gz to /users/drill/apache-drill-1.13.0.tar.gz ... Failed.
> Failed to upload Drill archive
>   Caused by: Failed to create DFS directory: /users/drill
>   Caused by: Call From Pauls-MBP/192.168.1.243 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused;
> For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> {noformat}
> (Shout out here to [~arjun-kr] for suggesting we include the extra exception details; very helpful here.)
> The workaround is to manually change the port to 8020 in the {{fs.defaultFS}} config setting shown above.
> The full fix is to change the code to use the following line in {{connect()}}:
> {code}
>     String dfsConnection = FileSystem.getDefaultUri(yarnConf).toString();
> {code}
> This bug is serious because it constrains the ability of users to select non-default HDFS ports.
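
The port mismatch described in the issue can be reproduced with nothing but {{java.net.URI}}. The sketch below is illustrative only: it does not call Hadoop or Drill, the class name {{DfsPortSketch}} and the helper {{effectivePort}} are hypothetical, and the fallback value 8020 is taken from the issue text above. It shows that a URI written without a port reports no port at all, so whatever consumes it must substitute a default.

```java
import java.net.URI;

public class DfsPortSketch {
    // Hadoop's default port as stated in the issue text (assumption: 8020).
    static final int DEFAULT_NAMENODE_PORT = 8020;

    // Hypothetical helper: returns the port a client would end up using.
    // java.net.URI.getPort() returns -1 when the URI carries no explicit port.
    static int effectivePort(URI fsUri, int defaultPort) {
        int port = fsUri.getPort();
        return port == -1 ? defaultPort : port;
    }

    public static void main(String[] args) {
        // Value reported for FileSystem.getDefaultUri(yarnConf) in the issue.
        URI fromYarnConf = URI.create("hdfs://localhost:9000");
        // Value reported for DrillOnYarnConfig.DFS_CONNECTION in the issue.
        URI fromDrillConfig = URI.create("hdfs://localhost/");

        System.out.println(fromYarnConf + " -> port "
            + effectivePort(fromYarnConf, DEFAULT_NAMENODE_PORT));
        System.out.println(fromDrillConfig + " -> port "
            + effectivePort(fromDrillConfig, DEFAULT_NAMENODE_PORT));
    }
}
```

Run as-is, this prints port 9000 for the first URI and port 8020 for the second, which matches the {{localhost:8020}} target in the connection-refused error quoted in the issue.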


