Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/04/03 21:33:00 UTC

[jira] [Commented] (DRILL-6268) Drill-on-YARN client obtains HDFS URL incorrectly

    [ https://issues.apache.org/jira/browse/DRILL-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314359#comment-17314359 ] 

ASF GitHub Bot commented on DRILL-6268:
---------------------------------------

paul-rogers commented on pull request #2139:
URL: https://github.com/apache/drill/pull/2139#issuecomment-812928381


   @cgivre, revisiting this one: the fix is probably not correct. As explained earlier, the goal is to 1) use the YARN config by default, unless 2) it is overridden in the DoY config file. Here is the default config from `drill-on-yarn-defaults.conf`:
   
   ```
   drill.yarn: {
     ...
     dfs: {
       connection: ""
       app-dir: "/user/drill"
     }
   ```
   
   The code says:
   
   ```java
       String dfsConnection = config.getString(DrillOnYarnConfig.DFS_CONNECTION);
       try {
         if (DoYUtil.isBlank(dfsConnection)) {
           fs = FileSystem.get(yarnConf);
   ```
   
   So, if the `dfs.connection` property is blank, use the one from the YARN config file.
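
   As a rough illustration of that rule (not the actual Drill code), the precedence can be sketched with plain strings standing in for the two configs. The class and method names here are hypothetical, and treating whitespace-only values as blank is an assumption about what `DoYUtil.isBlank` does:

   ```java
   // Sketch only: DfsConnectionSketch and resolveConnection are hypothetical
   // names, not part of Drill. Plain strings stand in for the DoY and YARN configs.
   class DfsConnectionSketch {

     // Assumption: "blank" means null, empty, or whitespace-only.
     static boolean isBlank(String s) {
       return s == null || s.trim().isEmpty();
     }

     // Returns the DoY override when one is set, otherwise the YARN default.
     static String resolveConnection(String doyConnection, String yarnDefaultFs) {
       return isBlank(doyConnection) ? yarnDefaultFs : doyConnection;
     }

     public static void main(String[] args) {
       // Blank DoY value, as in drill-on-yarn-defaults.conf: the YARN value is used.
       System.out.println(resolveConnection("", "hdfs://localhost:9000"));
       // Non-blank DoY value, as in the example file: it overrides the YARN value.
       System.out.println(resolveConnection("hdfs://localhost/", "hdfs://localhost:9000"));
     }
   }
   ```

   The second call shows how a leftover example value silently wins over a correct YARN setting.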
   
   Again, why might there be a different DoY value? Because some users push apps to multiple servers, and the DoY config should be sufficient to do so without requiring multiple different YARN configs. (If, in practice, people use only one config, we can remove these DoY settings. But let's assume they are needed.)
   
   So, the question is, why did the user see the bug which was reported? Where did the `"hdfs://localhost/"` value come from? **That** is the bug we need to fix.
   
   The answer seems to be that someone used `drill-on-yarn-example.conf` as their config without checking whether the *example* values were appropriate. (This is an *example*, not a *default*.) The relevant part of that file:
   
   ```
   drill.yarn: {
     ...
     dfs: {
       # Connection to the distributed file system. Defaults to work with
       # a single-node Drill on the local machine.
       # Omit this if you want to get the configuration either from the
       # Hadoop config (set with config-dir above) or from the
       # $DRILL_HOME/core-site.xml.
   
       connection: "hdfs://localhost/"
   ```
   
   Why is that being used? The proper "default" file is `drill-on-yarn-override.conf` from `distribution`. But, it looks like the `component.xml` file is missing a line. So, maybe the user renamed the example file to `drill-on-yarn-override.conf`. We need:
   
   ```xml
       <file>
         <source>src/main/resources/drill-on-yarn-override.conf</source>
         <outputDirectory>conf</outputDirectory>
         <fileMode>0640</fileMode>
       </file>
   ```
   
   With the above Maven fix, we don't need to change the code: the code does what it is supposed to do, if given a proper (blank) config entry.
   
   An "extra for experts" fix is to add the updated port number to the example file above.
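
   To see why the missing port in the example file matters, here is a small JDK-only sketch. It assumes, per the issue description, that 8020 is the port used when the URI names none; `DfsPortSketch` and `effectivePort` are hypothetical names, and the real fallback happens inside the Hadoop client rather than in code like this:

   ```java
   import java.net.URI;

   // Sketch only: shows how a connection string without a port differs from
   // one with an explicit port. Not part of Drill or Hadoop.
   class DfsPortSketch {

     // Assumption taken from the issue description: port used when none is given.
     static final int ASSUMED_DEFAULT_PORT = 8020;

     // java.net.URI.getPort() returns -1 when the URI has no explicit port.
     static int effectivePort(String connection) {
       int port = URI.create(connection).getPort();
       return port == -1 ? ASSUMED_DEFAULT_PORT : port;
     }

     public static void main(String[] args) {
       // Value from drill-on-yarn-example.conf: no port, so the default applies.
       System.out.println(effectivePort("hdfs://localhost/"));
       // Value from the Hadoop single-node instructions: explicit port.
       System.out.println(effectivePort("hdfs://localhost:9000"));
     }
   }
   ```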




> Drill-on-YARN client obtains HDFS URL incorrectly
> -------------------------------------------------
>
>                 Key: DRILL-6268
>                 URL: https://issues.apache.org/jira/browse/DRILL-6268
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Charles Givre
>            Priority: Major
>
> The Drill-on-YARN client must upload files to HDFS so that YARN can localize them. The code that does so is in {{DfsFacade}}. This code obtains the URL twice. The first time is correct:
> {code}
>   private void loadYarnConfig() {
>     ...
>       URI fsUri = FileSystem.getDefaultUri( yarnConf );
>       if(fsUri.toString().startsWith("file:/")) {
>         System.err.println("Warning: Default DFS URI is for a local file system: " + fsUri);
>       }
>     }
>   }
> {code}
> The {{fsUri}} returned is {{hdfs://localhost:9000}}, which is the correct value for an out-of-the-box Hadoop 2.9.0 install after following [these instructions|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html]. The instructions have the reader explicitly set the port number to 9000:
> {code}
> <configuration>
>     <property>
>         <name>fs.defaultFS</name>
>         <value>hdfs://localhost:9000</value>
>     </property>
> </configuration>
> {code}
> The other place that gets the URL, this time for real, is {{DfsFacade.connect()}}:
> {code}
>     String dfsConnection = config.getString(DrillOnYarnConfig.DFS_CONNECTION);
> {code}
> This value comes back as {{hdfs://localhost/}}, which causes HDFS to try to connect on port 8020 (the Hadoop default), resulting in the following error:
> {noformat}
> Connecting to DFS... Connected.
> Uploading /Users/paulrogers/bin/apache-drill-1.13.0.tar.gz to /users/drill/apache-drill-1.13.0.tar.gz ... Failed.
> Failed to upload Drill archive
>   Caused by: Failed to create DFS directory: /users/drill
>   Caused by: Call From Pauls-MBP/192.168.1.243 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused;
> For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> {noformat}
> (Shout out here to [~arjun-kr] for suggesting we include the extra exception details; very helpful here.)
> The workaround is to manually change the port to 8020 in the config setting shown above.
> The full fix is to change the code to use the following line in {{connect()}}:
> {code}
>     String dfsConnection = FileSystem.getDefaultUri(yarnConf).toString();
> {code}
> This bug is serious because it constrains the ability of users to select non-default HDFS ports.


