Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/02 03:00:00 UTC
[jira] [Commented] (DRILL-6268) Drill-on-YARN client obtains HDFS URL incorrectly
[ https://issues.apache.org/jira/browse/DRILL-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612020#comment-17612020 ]
ASF GitHub Bot commented on DRILL-6268:
---------------------------------------
cgivre closed pull request #2139: DRILL-6268: Drill-on-YARN client obtains HDFS URL incorrectly
URL: https://github.com/apache/drill/pull/2139
> Drill-on-YARN client obtains HDFS URL incorrectly
> -------------------------------------------------
>
> Key: DRILL-6268
> URL: https://issues.apache.org/jira/browse/DRILL-6268
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Charles Givre
> Priority: Major
>
> The Drill-on-YARN client must upload files to HDFS so that YARN can localize them. The code that does so is in {{DfsFacade}}. This code obtains the URL twice. The first time is correct:
> {code}
> private void loadYarnConfig() {
>   ...
>   URI fsUri = FileSystem.getDefaultUri(yarnConf);
>   if (fsUri.toString().startsWith("file:/")) {
>     System.err.println("Warning: Default DFS URI is for a local file system: " + fsUri);
>   }
> }
> {code}
> The {{fsUri}} returned is {{hdfs://localhost:9000}}, which is the correct value for an out-of-the-box Hadoop 2.9.0 install after following [these instructions|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html]. The instructions have the reader explicitly set the port number to 9000:
> {code}
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://localhost:9000</value>
>   </property>
> </configuration>
> {code}
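> To see why the missing port matters: a URI with no explicit port reports no port at all, and the HDFS client then falls back to its default of 8020, as the error below shows. A minimal sketch of that fallback logic using only `java.net.URI` (no Hadoop dependency; `effectivePort` and the 8020 constant are illustrative, not Drill code):

```java
import java.net.URI;

public class DfsPortDemo {
    // Hadoop's default NameNode RPC port, as seen in the error message below.
    static final int DEFAULT_HDFS_PORT = 8020;

    static int effectivePort(String uri) {
        int port = URI.create(uri).getPort(); // -1 when the URI carries no port
        return port == -1 ? DEFAULT_HDFS_PORT : port;
    }

    public static void main(String[] args) {
        // fs.defaultFS as set by the single-node instructions: port is explicit.
        System.out.println(effectivePort("hdfs://localhost:9000")); // 9000
        // The value DfsFacade.connect() actually reads: no port, so 8020 is used.
        System.out.println(effectivePort("hdfs://localhost/"));     // 8020
    }
}
```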
> The other place that obtains the URL, this time for real, is {{DfsFacade.connect()}}:
> {code}
> String dfsConnection = config.getString(DrillOnYarnConfig.DFS_CONNECTION);
> {code}
> This value comes back as {{hdfs://localhost/}}, which causes HDFS to try to connect on port 8020 (the Hadoop default), resulting in the following error:
> {noformat}
> Connecting to DFS... Connected.
> Uploading /Users/paulrogers/bin/apache-drill-1.13.0.tar.gz to /users/drill/apache-drill-1.13.0.tar.gz ... Failed.
> Failed to upload Drill archive
> Caused by: Failed to create DFS directory: /users/drill
> Caused by: Call From Pauls-MBP/192.168.1.243 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused;
> For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> {noformat}
> (Shout out here to [~arjun-kr] for suggesting we include the extra exception details; very helpful here.)
> The workaround is to manually change the port to 8020 in the config setting shown above.
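> That workaround, sketched as a config fragment (assuming the setting lives in the Drill-on-YARN site file, e.g. {{drill-on-yarn.conf}}, under the key {{drill.yarn.dfs.connection}}; verify the key name against your Drill version):
> {code}
> drill.yarn: {
>   dfs: {
>     connection: "hdfs://localhost:9000"
>   }
> }
> {code}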
> The full fix is to change the code to use the following line in {{connect()}}:
> {code}
> String dfsConnection = FileSystem.getDefaultUri(yarnConf).toString();
> {code}
> This bug is serious because it constrains the ability of users to select non-default HDFS ports.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)