Posted to issues@flink.apache.org by "Felix seibert (JIRA)" <ji...@apache.org> on 2019/05/19 11:23:00 UTC

[jira] [Commented] (FLINK-12550) hostnames with a dot never receive local input splits

    [ https://issues.apache.org/jira/browse/FLINK-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843397#comment-16843397 ] 

Felix seibert commented on FLINK-12550:
---------------------------------------

After opening PR #8478 yesterday, I have some additional considerations.

The status quo is the following:
 * To check if an input split is locally available for a taskmanager, the hostname of the taskmanager is compared to the hostname of the input split. This happens in [this line|https://github.com/apache/flink/blob/4fa387164cea44f8e0bac1aadab11433c0f0ff2b/flink-core/src/main/java/org/apache/flink/api/common/io/LocatableInputSplitAssigner.java#L223]:

 
{code:java}
if (h != null && NetUtils.getHostnameFromFQDN(h.toLowerCase()).equals(flinkHost)){code}
h is the hostname of a machine hosting the input split; flinkHost is the hostname of the taskmanager that is looking for an input split. NetUtils.getHostnameFromFQDN() truncates at the first occurrence of a ".". So, if a split is present on "host.domain", and the hostname of the taskmanager is "host.domain" too, we actually check whether "host".equals("host.domain"), which is not true. PR #8478 applies getHostnameFromFQDN() to the taskmanager hostname as well, so it seems that this problem is fixed.
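
To make the mismatch concrete, here is a minimal, self-contained sketch of the two comparisons. It does not use Flink itself: truncateAtFirstDot() is a hypothetical stand-in that only mimics the "cut at the first dot" behaviour described above, and the two match methods are my own simplification of the check, not the actual code of LocatableInputSplitAssigner or of the PR.

```java
import java.util.Locale;

class HostnameTruncationSketch {

    // Hypothetical stand-in for NetUtils.getHostnameFromFQDN(): keeps only the
    // part before the first ".". Assumed behaviour, not the Flink source.
    static String truncateAtFirstDot(String fqdn) {
        int dot = fqdn.indexOf('.');
        return dot == -1 ? fqdn : fqdn.substring(0, dot);
    }

    // Status quo: only the split's hostname is truncated before comparing.
    static boolean matchesStatusQuo(String splitHost, String flinkHost) {
        return splitHost != null
                && truncateAtFirstDot(splitHost.toLowerCase(Locale.ROOT)).equals(flinkHost);
    }

    // Idea of PR #8478 as described above: truncate both sides before comparing.
    static boolean matchesBothTruncated(String splitHost, String flinkHost) {
        return splitHost != null
                && truncateAtFirstDot(splitHost.toLowerCase(Locale.ROOT))
                        .equals(truncateAtFirstDot(flinkHost));
    }

    public static void main(String[] args) {
        // "host".equals("host.domain") -> prints false
        System.out.println(matchesStatusQuo("host.domain", "host.domain"));
        // "host".equals("host") -> prints true
        System.out.println(matchesBothTruncated("host.domain", "host.domain"));
    }
}
```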

 

BUT. What if there is a taskmanager on host "host.cluster1.domain", and an input split on host "host.cluster2.domain"? isLocal() would recognize this split as being on the same host as the taskmanager, which is clearly not the case.

So to me it looks like getHostnameFromFQDN() should not be applied to either of the two compared hostnames.

Or is there any reason why it should be applied?
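
The two-cluster scenario above can be sketched the same way. Again, truncateAtFirstDot() is a hypothetical stand-in for the truncation described in this comment, and isLocalTruncated() / isLocalFullName() are illustrative names of my own, not methods that exist in Flink. Both hostnames are assumed to be non-null.

```java
import java.util.Locale;

class FullHostnameComparisonSketch {

    // Hypothetical stand-in for the truncation at the first "." (assumed behaviour).
    static String truncateAtFirstDot(String fqdn) {
        int dot = fqdn.indexOf('.');
        return dot == -1 ? fqdn : fqdn.substring(0, dot);
    }

    // Both hostnames truncated before comparison.
    static boolean isLocalTruncated(String splitHost, String flinkHost) {
        return truncateAtFirstDot(splitHost.toLowerCase(Locale.ROOT))
                .equals(truncateAtFirstDot(flinkHost.toLowerCase(Locale.ROOT)));
    }

    // Alternative raised above: compare the full hostnames, no truncation at all.
    static boolean isLocalFullName(String splitHost, String flinkHost) {
        return splitHost.equalsIgnoreCase(flinkHost);
    }

    public static void main(String[] args) {
        String split = "host.cluster2.domain";
        String taskManager = "host.cluster1.domain";

        // Different machines, but both truncate to "host" -> prints true
        System.out.println(isLocalTruncated(split, taskManager));
        // Full names differ -> prints false
        System.out.println(isLocalFullName(split, taskManager));
        // Same machine, name contains a dot -> prints true
        System.out.println(isLocalFullName("host.domain", "host.domain"));
    }
}
```

Note that with the full-name comparison, "host" and "host.domain" no longer match each other.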

 

> hostnames with a dot never receive local input splits
> -----------------------------------------------------
>
>                 Key: FLINK-12550
>                 URL: https://issues.apache.org/jira/browse/FLINK-12550
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataSet
>    Affects Versions: 1.8.0
>            Reporter: Felix seibert
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> LocatableInputSplitAssigner (in package api.common.io) fails to assign local input splits to hosts whose hostname contains a dot ("."). To reproduce, add the following test to LocatableSplitAssignerTest and execute it. It will always fail. In my mind, this is contrary to the expected behaviour, which is that the host should obtain the one split that is stored on the very same machine.
>  
> {code:java}
> @Test
> public void testLocalSplitAssignmentForHostWithDomainName() {
>    try {
>       String hostNameWithDot = "testhost.testdomain";
>       // load one split
>       Set<LocatableInputSplit> splits = new HashSet<LocatableInputSplit>();
>       splits.add(new LocatableInputSplit(0, hostNameWithDot));
>       // get next split for the host
>       LocatableInputSplitAssigner ia = new LocatableInputSplitAssigner(splits);
>       InputSplit is = null;
>       ia.getNextInputSplit(hostNameWithDot, 0);
>       // there should be exactly zero remote and one local assignment
>       assertEquals(0, ia.getNumberOfRemoteAssignments());
>       assertEquals(1, ia.getNumberOfLocalAssignments());
>    }
>    catch (Exception e) {
>       e.printStackTrace();
>       fail(e.getMessage());
>    }
> }
> {code}
> I also experienced this error in practice, and will later today open a pull request to fix it.
>  
> Note: I'm not sure if I selected the correct component category.
>  


