Posted to dev@sqoop.apache.org by "Mate Juhasz (JIRA)" <ji...@apache.org> on 2019/05/15 10:36:00 UTC

[jira] [Created] (SQOOP-3439) Sqoop export - All mappers are launched on the same node manager

Mate Juhasz created SQOOP-3439:
----------------------------------

             Summary: Sqoop export - All mappers are launched on the same node manager
                 Key: SQOOP-3439
                 URL: https://issues.apache.org/jira/browse/SQOOP-3439
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.7
            Reporter: Mate Juhasz


Found an interesting behaviour during a Sqoop export to an Oracle database, but it's not only Oracle-related.

The debug-level YARN application logs show that the Application Master requests containers across multiple nodes, so the resource requests are correct.
However, the Resource Manager allocates all of the containers on a single node, for a reason that is not yet clear. The MR AM requests only NODE_LOCAL containers on all hosts.

The hdfs fsck output shows that the blocks are spread across all datanodes, but the 200 blocks are grouped into 12 map splits by the input format class SqoopHCatExportFormat.

The issue seems to be caused by the logic in SqoopHCatInputSplit#getLocations, which returns the union of the block locations of all the grouped splits.

{noformat}
  @Override
  public String[] getLocations() throws IOException, InterruptedException {
    if (this.hCatLocations == null) {
      Set<String> locations = new HashSet<String>();
      for (HCatSplit split : this.hCatSplits) {
        locations.addAll(Arrays.asList(split.getLocations()));
      }
      this.hCatLocations = locations.toArray(new String[0]);
    }
    return this.hCatLocations;
  }
{noformat}
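
To illustrate why the union may defeat locality, here is a minimal, self-contained sketch. It does not use any Sqoop or HCatalog classes: the class name LocationUnionDemo, the four node names, the replication factor of 3 and the round-robin block placement are all assumptions made up for the example. Only the union step mirrors the getLocations() logic quoted above, and the 200 blocks / 12 splits figures come from this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class LocationUnionDemo {

  // Hypothetical cluster of four datanodes (names are illustrative only).
  static final String[] NODES = {"node1", "node2", "node3", "node4"};

  // Assumed placement: each block has 3 replicas on consecutive nodes, round-robin.
  static String[] blockLocations(int blockIndex) {
    String[] hosts = new String[3];
    for (int r = 0; r < hosts.length; r++) {
      hosts[r] = NODES[(blockIndex + r) % NODES.length];
    }
    return hosts;
  }

  // Same idea as the quoted getLocations(): union of all grouped block locations.
  static Set<String> unionLocations(List<String[]> groupedBlocks) {
    Set<String> locations = new TreeSet<>();
    for (String[] hosts : groupedBlocks) {
      locations.addAll(Arrays.asList(hosts));
    }
    return locations;
  }

  // Group totalBlocks contiguous blocks into numSplits splits and
  // return the location set that each grouped split would report.
  static List<Set<String>> groupedSplitLocations(int totalBlocks, int numSplits) {
    List<Set<String>> result = new ArrayList<>();
    int perSplit = (totalBlocks + numSplits - 1) / numSplits; // ceiling division
    for (int s = 0; s < numSplits; s++) {
      List<String[]> blocks = new ArrayList<>();
      int end = Math.min(totalBlocks, (s + 1) * perSplit);
      for (int b = s * perSplit; b < end; b++) {
        blocks.add(blockLocations(b));
      }
      result.add(unionLocations(blocks));
    }
    return result;
  }

  public static void main(String[] args) {
    // 200 blocks grouped into 12 splits, as in the report.
    for (Set<String> locations : groupedSplitLocations(200, 12)) {
      System.out.println(locations);
    }
  }
}
```

Under these assumptions every one of the 12 splits reports all four nodes as its locations, so the location hints no longer distinguish one split from another. If that also holds on the real cluster, it would be consistent with the scheduler being able to satisfy every NODE_LOCAL request on the same node.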

I also tried the --direct option, where I assume OraOopDBInputSplit is used instead, but the result is the same:

{noformat}
  @Override
  public String[] getLocations() throws IOException {
    if (this.splitLocation.isEmpty()) {
      return new String[] {};
    } else {
      return new String[] { this.splitLocation };
    }
  }
{noformat}



