Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2018/11/16 22:32:00 UTC

[jira] [Commented] (SPARK-26094) Streaming WAL should create parent dirs

    [ https://issues.apache.org/jira/browse/SPARK-26094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690075#comment-16690075 ] 

Imran Rashid commented on SPARK-26094:
--------------------------------------

When playing around with this, I noticed another difference -- {{fs.create()}} accepts relative paths, while {{fs.createFile()}} requires absolute paths.  When I tried it with a relative path, I got

{noformat}
java.lang.IllegalArgumentException: Pathname floop/blah from floop/blah is not a valid DFS filename.
  at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:233)
  at org.apache.hadoop.hdfs.DistributedFileSystem$10.doCall(DistributedFileSystem.java:563)
  at org.apache.hadoop.hdfs.DistributedFileSystem$10.doCall(DistributedFileSystem.java:560)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:581)
  at org.apache.hadoop.hdfs.DistributedFileSystem.access$800(DistributedFileSystem.java:121)
  at org.apache.hadoop.hdfs.DistributedFileSystem$HdfsDataOutputStreamBuilder.build(DistributedFileSystem.java:3026)
  ... 53 elided
{noformat}
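
One possible way to sidestep both problems (the relative path here and the missing parent directories described in the issue) is to qualify the path and create its parent before calling {{fs.createFile()}}. The following is only a sketch under assumptions, not the change made for this issue: the class {{CreateFileSketch}} and the helper {{openForWrite}} are hypothetical names, and it assumes a Hadoop version that provides {{FileSystem.createFile(Path)}}.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateFileSketch {

    // Hypothetical helper (not Spark's actual fix): make the path absolute
    // and ensure its parent directory exists before using createFile().
    static FSDataOutputStream openForWrite(FileSystem fs, Path path) throws IOException {
        // makeQualified() resolves a relative path against the filesystem's
        // working directory and adds the scheme and authority.
        Path qualified = fs.makeQualified(path);

        // createFile().build() took the non-recursive create path in the
        // stack trace above, so create the parent directories explicitly.
        Path parent = qualified.getParent();
        if (parent != null) {
            fs.mkdirs(parent);
        }
        return fs.createFile(qualified).build();
    }

    public static void main(String[] args) throws IOException {
        // Uses whatever default filesystem the Configuration points at.
        FileSystem fs = FileSystem.get(new Configuration());
        // Same relative path as in the failing example above.
        try (FSDataOutputStream out = openForWrite(fs, new Path("floop/blah"))) {
            out.write("hello".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Qualifying the path up front keeps the caller's behavior the same as with {{fs.create()}}, which already accepts relative paths; whether the explicit {{mkdirs}} call is the right place to restore the old parent-directory behavior is a separate design question.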

> Streaming WAL should create parent dirs
> ---------------------------------------
>
>                 Key: SPARK-26094
>                 URL: https://issues.apache.org/jira/browse/SPARK-26094
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Assignee: Imran Rashid
>            Priority: Blocker
>
> SPARK-25871 introduced a regression in the streaming WAL -- it no longer makes all the parent dirs, so you may see an exception like this in cases that used to work:
> {noformat}
> 18/11/09 03:31:48 ERROR util.FileBasedWriteAheadLog_ReceiverSupervisorImpl: Failed to write to write ahead log after 3 failures
> ...
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>         at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
>         at org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:210)
> ...
> Caused by: java.io.FileNotFoundException: Parent directory doesn't exist: /tmp/__spark__1e8ba184-d323-47eb-b857-0e6285409424/88992/checkpoints/receivedData/0
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyParentDir(FSDirectory.java:1923)
> {noformat}


