You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by algermissen1971 <al...@icloud.com> on 2015/07/10 23:10:17 UTC
Spark Streaming and using Swift object store for checkpointing
Hi,
initially today when moving my streaming application to the cluster the first time I ran in to newbie error of using a local file system for checkpointing and the RDD partition count differences (see exception below).
Having neither HDFS nor S3 (and the Cassandra-Connector not yet supporting checkpointing[1]) I turned to Swift (which is already available in our architecture).
I mounted Swift using cloudfuse[2] I see the checkpoint files on all three cluster nodes - but still the job fails with the mentioned exception.
I experimented with cloudfuse caching settings but that does not *seem* to help.
Can anyone shed some light on this issue and provide a hint what I might be doing wrong here?
Jan
[1] https://datastax-oss.atlassian.net/browse/SPARKC-13
[2] https://github.com/redbo/cloudfuse
Exception:
org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[72] at print at App.scala:47(0) has different number of partitions than original RDD MapPartitionsRDD[70] at updateStateByKey at App.scala:47(2)
at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:103)
at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply$mcV$sp(RDD.scala:1538)
at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1535)
at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1535)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1534)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1735)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1750)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1765)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1272)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
at org.apac....
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: Spark Streaming and using Swift object store for checkpointing
Posted by algermissen1971 <al...@icloud.com>.
On 10 Jul 2015, at 23:10, algermissen1971 <al...@icloud.com> wrote:
> Hi,
>
> initially today when moving my streaming application to the cluster the first time I ran in to newbie error of using a local file system for checkpointing and the RDD partition count differences (see exception below).
>
> Having neither HDFS nor S3 (and the Cassandra-Connector not yet supporting checkpointing[1]) I turned to Swift (which is already available in our architecture).
>
> I mounted Swift using cloudfuse[2] I see the checkpoint files on all three cluster nodes - but still the job fails with the mentioned exception.
>
> I experimented with cloudfuse caching settings but that does not *seem* to help.
>
> Can anyone shed some light on this issue and provide a hint what I might be doing wrong here?
In case this helps somebody else, here is what made it work for me after playing with all the options:
cloudfuse -o username=xxxx,api_key=xxxx,direct_io,hard_remove,entry_timeout=1,big_writes,cache_timeout=1,use_snet=True /swift
It seems I also had to remove the ~/.cloudfuse files I created previously for the options to take effect.
Jan
>
> Jan
>
> [1] https://datastax-oss.atlassian.net/browse/SPARKC-13
> [2] https://github.com/redbo/cloudfuse
>
>
>
> Exception:
>
> org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[72] at print at App.scala:47(0) has different number of partitions than original RDD MapPartitionsRDD[70] at updateStateByKey at App.scala:47(2)
> at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:103)
> at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply$mcV$sp(RDD.scala:1538)
> at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1535)
> at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1535)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
> at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1534)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1735)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1750)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1765)
> at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1272)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
> at org.apac....
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org