You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by pw...@apache.org on 2013/12/31 19:13:19 UTC
[16/16] git commit: Merge pull request #289 from tdas/filestream-fix
Merge pull request #289 from tdas/filestream-fix
Bug fixes for file input stream and checkpointing
- Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (listing files when a background thread it deleting fails caused errors, etc.)
- Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS compatible store requiring special configuration.
- Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten.
- Fixed bug where setting checkpoint directory as a relative local path caused the checkpointing to fail.
Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/55b7e2fd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/55b7e2fd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/55b7e2fd
Branch: refs/heads/master
Commit: 55b7e2fdffc6c3537da69152a3d02d5be599fa1b
Parents: 50e3b8e fcd17a1
Author: Patrick Wendell <pw...@gmail.com>
Authored: Tue Dec 31 10:12:51 2013 -0800
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Tue Dec 31 10:12:51 2013 -0800
----------------------------------------------------------------------
.../scala/org/apache/spark/SparkContext.scala | 23 +--
.../spark/api/java/JavaSparkContext.scala | 15 +-
.../org/apache/spark/rdd/CheckpointRDD.scala | 32 ++--
.../apache/spark/rdd/RDDCheckpointData.scala | 15 +-
.../scala/org/apache/spark/JavaAPISuite.java | 4 +-
python/pyspark/context.py | 9 +-
python/pyspark/tests.py | 4 +-
.../org/apache/spark/streaming/Checkpoint.scala | 52 ++++---
.../spark/streaming/StreamingContext.scala | 32 ++--
.../api/java/JavaStreamingContext.scala | 13 +-
.../streaming/dstream/FileInputDStream.scala | 153 +++++++++++--------
.../streaming/scheduler/JobGenerator.scala | 67 +++++---
.../spark/streaming/CheckpointSuite.scala | 44 +++---
.../spark/streaming/InputStreamsSuite.scala | 2 +-
14 files changed, 269 insertions(+), 196 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/55b7e2fd/core/src/main/scala/org/apache/spark/SparkContext.scala
----------------------------------------------------------------------