Posted to user@spark.apache.org by shyla deshpande <de...@gmail.com> on 2017/02/16 22:40:10 UTC

Spark standalone cluster on EC2 error .. Checkpoint..

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/checkpoint/11ea8862-122c-4614-bc7e-f761bb57ba23/rdd-347/.part-00001-attempt-3
could only be replicated to 0 nodes instead of minReplication (=1).  There
are 0 datanode(s) running and no node(s) are excluded in this operation.

This is the error I get when I run my Spark Streaming app on a 2-node EC2
cluster with 1 master and 1 worker.

Works fine in local mode. Please help.

Thanks

Re: Spark standalone cluster on EC2 error .. Checkpoint..

Posted by shyla deshpande <de...@gmail.com>.
Thanks TD and Marco for the feedback.

The directory referenced by SPARK_LOCAL_DIRS did not exist. After creating
that directory, it worked.
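
For anyone hitting the same thing, here is a minimal sketch of a pre-flight check that could be run on each node before starting the worker. It assumes SPARK_LOCAL_DIRS holds a comma-separated list of directories; the function name ensure_local_dirs is hypothetical and not part of Spark.

```python
import os


def ensure_local_dirs(value):
    """Create every directory named in a comma-separated SPARK_LOCAL_DIRS value.

    Returns the directories that were missing and had to be created.
    (Hypothetical helper for illustration; not a Spark API.)
    """
    created = []
    for raw in value.split(","):
        path = raw.strip()
        if not path:
            # Skip empty entries such as a trailing comma.
            continue
        if not os.path.isdir(path):
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created
```

Calling ensure_local_dirs(os.environ.get("SPARK_LOCAL_DIRS", "")) returns an empty list when nothing was missing, so it is safe to run more than once.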

This was my first time running Spark on a standalone cluster, so I missed
it.

Thanks


Re: Spark standalone cluster on EC2 error .. Checkpoint..

Posted by Tathagata Das <ta...@gmail.com>.
Seems like an issue with the HDFS you are using for checkpointing. It's not
able to write data properly.
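
A note on the path in the error: /checkpoint/... carries no URI scheme, and Hadoop resolves scheme-less paths against the configured default filesystem (fs.defaultFS). The sketch below only inspects a path string to show which case applies. The function name checkpoint_target is hypothetical, and nothing here contacts a cluster or confirms what the default filesystem actually is.

```python
from urllib.parse import urlparse


def checkpoint_target(path):
    """Report which filesystem scheme a checkpoint path names explicitly.

    Hypothetical helper for illustration: a path without a scheme is
    resolved by Hadoop against fs.defaultFS, so we just say so.
    """
    scheme = urlparse(path).scheme
    if scheme:
        return scheme
    return "default filesystem (fs.defaultFS)"
```

Passing an explicit URI such as hdfs://... or file://... as the checkpoint directory makes the target filesystem unambiguous, which can help when the same job behaves differently in local mode and on a cluster.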
