Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/06/01 08:50:00 UTC

[jira] [Commented] (FLINK-9366) Distribute Cache only works for client-accessible files

    [ https://issues.apache.org/jira/browse/FLINK-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497750#comment-16497750 ] 

ASF GitHub Bot commented on FLINK-9366:
---------------------------------------

Github user dawidwys commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6107#discussion_r192334113
  
    --- Diff: flink-fs-tests/src/test/java/org/apache/flink/hdfstests/DistributedCacheDfsTest.java ---
    @@ -0,0 +1,166 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.hdfstests;
    +
    +import org.apache.flink.api.common.functions.RichMapFunction;
    +import org.apache.flink.core.fs.FileStatus;
    +import org.apache.flink.core.fs.FileSystem;
    +import org.apache.flink.core.fs.Path;
    +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    +import org.apache.flink.test.util.MiniClusterResource;
    +import org.apache.flink.util.NetUtils;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.hdfs.MiniDFSCluster;
    +import org.junit.AfterClass;
    +import org.junit.BeforeClass;
    +import org.junit.ClassRule;
    +import org.junit.Test;
    +import org.junit.rules.TemporaryFolder;
    +
    +import java.io.DataInputStream;
    +import java.io.DataOutputStream;
    +import java.io.File;
    +import java.io.IOException;
    +import java.net.URI;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertFalse;
    +import static org.junit.Assert.assertTrue;
    +
    +/**
    + * Tests for distributing files with {@link org.apache.flink.api.common.cache.DistributedCache} via HDFS.
    + */
    +public class DistributedCacheDfsTest {
    +
    +	private static final String testFileContent = "Goethe - Faust: Der Tragoedie erster Teil\n" + "Prolog im Himmel.\n"
    +		+ "Der Herr. Die himmlischen Heerscharen. Nachher Mephistopheles. Die drei\n" + "Erzengel treten vor.\n"
    +		+ "RAPHAEL: Die Sonne toent, nach alter Weise, In Brudersphaeren Wettgesang,\n"
    +		+ "Und ihre vorgeschriebne Reise Vollendet sie mit Donnergang. Ihr Anblick\n"
    +		+ "gibt den Engeln Staerke, Wenn keiner Sie ergruenden mag; die unbegreiflich\n"
    +		+ "hohen Werke Sind herrlich wie am ersten Tag.\n"
    +		+ "GABRIEL: Und schnell und unbegreiflich schnelle Dreht sich umher der Erde\n"
    +		+ "Pracht; Es wechselt Paradieseshelle Mit tiefer, schauervoller Nacht. Es\n"
    +		+ "schaeumt das Meer in breiten Fluessen Am tiefen Grund der Felsen auf, Und\n"
    +		+ "Fels und Meer wird fortgerissen Im ewig schnellem Sphaerenlauf.\n"
    +		+ "MICHAEL: Und Stuerme brausen um die Wette Vom Meer aufs Land, vom Land\n"
    +		+ "aufs Meer, und bilden wuetend eine Kette Der tiefsten Wirkung rings umher.\n"
    +		+ "Da flammt ein blitzendes Verheeren Dem Pfade vor des Donnerschlags. Doch\n"
    +		+ "deine Boten, Herr, verehren Das sanfte Wandeln deines Tags.";
    +
    +	@ClassRule
    +	public static TemporaryFolder tempFolder = new TemporaryFolder();
    +
    +	private static MiniClusterResource miniClusterResource;
    +	private static MiniDFSCluster hdfsCluster;
    +	private static Configuration conf = new Configuration();
    +
    +	private static Path testFile;
    +	private static Path testDir;
    +
    +	@BeforeClass
    +	public static void setup() throws Exception {
    +		File dataDir = tempFolder.newFolder();
    +
    +		conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, dataDir.getAbsolutePath());
    +		MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(conf);
    +		hdfsCluster = builder.build();
    +
    +		String hdfsURI = "hdfs://"
    +			+ NetUtils.hostAndPortToUrlString(hdfsCluster.getURI().getHost(), hdfsCluster.getNameNodePort())
    +			+ "/";
    +
    +		miniClusterResource = new MiniClusterResource(
    --- End diff --
    
    It is not necessary. Changed it.


> Distribute Cache only works for client-accessible files
> -------------------------------------------------------
>
>                 Key: FLINK-9366
>                 URL: https://issues.apache.org/jira/browse/FLINK-9366
>             Project: Flink
>          Issue Type: Bug
>          Components: Client, Local Runtime
>    Affects Versions: 1.6.0
>            Reporter: Chesnay Schepler
>            Assignee: Dawid Wysakowicz
>            Priority: Blocker
>             Fix For: 1.6.0
>
>
> In FLINK-8620 the distributed cache was modified to distribute files via the blob store, instead of downloading them from a distributed filesystem.
> Previously, taskmanagers would download requested files from the DFS. Now, they retrieve them from the blob store. This requires the client to preemptively upload all files used with the distributed cache.
> As a result, it is no longer possible to use the distributed cache for files that reside in a cluster-internal DFS, as the client cannot download them. This is a regression from the previous behavior and may break existing setups.
> [~aljoscha] [~dawidwys]
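
The regression described above can be illustrated with a small toy model. This is only a sketch in plain Java and does not use any Flink classes: `ToyFileSystem`, `fetchFromDfs`, `fetchViaBlobStore` and the example path are all hypothetical names made up for illustration. It assumes a DFS that taskmanagers can read but the submitting client cannot.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types or methods are Flink APIs.
class DistributedCacheModel {

    /** Hypothetical stand-in for a file system holding path -> content. */
    static final class ToyFileSystem {
        /** Whether the submitting client can reach this file system. */
        final boolean clientAccessible;
        final Map<String, String> files = new HashMap<>();

        ToyFileSystem(boolean clientAccessible) {
            this.clientAccessible = clientAccessible;
        }
    }

    /** Old behavior: the taskmanager downloads the file from the DFS itself. */
    static String fetchFromDfs(ToyFileSystem dfs, String path) {
        String content = dfs.files.get(path);
        if (content == null) {
            throw new IllegalArgumentException("No such file: " + path);
        }
        return content;
    }

    /**
     * New behavior: the client first uploads the file to the blob store,
     * and the taskmanager then reads it from there. The upload step fails
     * if the client cannot reach the file system.
     */
    static String fetchViaBlobStore(ToyFileSystem dfs, String path, Map<String, String> blobStore) {
        if (!dfs.clientAccessible) {
            throw new IllegalStateException("Client cannot read " + path + " for upload");
        }
        blobStore.put(path, fetchFromDfs(dfs, path)); // client-side upload
        return blobStore.get(path);                   // taskmanager-side retrieval
    }

    public static void main(String[] args) {
        // A cluster-internal DFS: reachable by taskmanagers, not by the client.
        ToyFileSystem internalDfs = new ToyFileSystem(false);
        internalDfs.files.put("hdfs:///data/lookup.txt", "payload");

        System.out.println(fetchFromDfs(internalDfs, "hdfs:///data/lookup.txt"));
        try {
            fetchViaBlobStore(internalDfs, "hdfs:///data/lookup.txt", new HashMap<>());
        } catch (IllegalStateException e) {
            System.out.println("Upload failed: " + e.getMessage());
        }
    }
}
```

In this model the direct download succeeds while the blob-store route fails at the client-side upload, which is the situation the issue reports for cluster-internal files.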



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)