Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/01/06 11:00:49 UTC
[jira] [Resolved] (SPARK-4624) Errors when reading/writing to S3 large object files
[ https://issues.apache.org/jira/browse/SPARK-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-4624.
------------------------------
Resolution: Cannot Reproduce
This sounds like an S3 issue, but reopen if you can still reproduce and have more specific info about how to do it, and what exactly happens.
> Errors when reading/writing to S3 large object files
> ----------------------------------------------------
>
> Key: SPARK-4624
> URL: https://issues.apache.org/jira/browse/SPARK-4624
> Project: Spark
> Issue Type: Bug
> Components: EC2, Input/Output, Mesos
> Affects Versions: 1.1.0
> Environment: manually set up Mesos cluster in EC2 made of 30 c3.4xlarge nodes
> Reporter: Kriton Tsintaris
> Priority: Critical
>
> My cluster is not configured to use HDFS. Instead the local disk of each node is used.
> I've got a number of huge RDD object files (each made of ~600 part files of ~60 GB apiece). They are updated extremely rarely.
> An example of the model of the data stored in these RDDs is the following: (Long, Array[Long]).
> When I load them into my cluster, using val page_users = sc.objectFile[(Long,Array[Long])]("s3n://mybucket/path/myrdd.obj.rdd") or equivalent, sometimes data is missing (as if 1 or 2 of the part files were not successfully loaded).
> What is more frustrating is that I get no errors when this has happened! Sometimes reading from S3 times out or hits errors, but the automatic retries eventually succeed.
> Furthermore, if I attempt to write an RDD back into S3, using myrdd.saveAsObjectFile("s3n://..."), the operation will again terminate before completing, without any warning or indication of error.
> More specifically, what happens is that the object file parts are left under a _temporary folder and only a few of them are moved to the correct path in S3. This only happens when I am writing huge object files. If my object file is just a few GB everything is fine.
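Since the report describes silent truncation with no exception raised, one way to surface the loss is a count-based round-trip check against the same RDD API the reporter uses. The sketch below is illustrative only: the bucket and paths are placeholders (not the reporter's actual data), and it assumes a Spark 1.x-style SparkContext as in the report.

```scala
// Hedged sketch: detect silent data loss when round-tripping an RDD
// through S3 via objectFile / saveAsObjectFile. All bucket/path names
// below are placeholders, not taken from the original report.
import org.apache.spark.{SparkConf, SparkContext}

object S3RoundTripCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("s3-roundtrip-check"))

    val src = "s3n://mybucket/path/myrdd.obj.rdd"   // placeholder source path
    val dst = "s3n://mybucket/path/myrdd.copy.rdd"  // placeholder destination path

    // Count immediately on load, so a silently dropped part file
    // shows up later as a record-count mismatch.
    val rdd = sc.objectFile[(Long, Array[Long])](src)
    val expected = rdd.count()

    rdd.saveAsObjectFile(dst)

    // Re-read the copy and compare counts; a shortfall would indicate
    // that some part files never made it out of the _temporary folder.
    val actual = sc.objectFile[(Long, Array[Long])](dst).count()
    require(actual == expected,
      s"S3 round trip lost records: wrote $expected, read back $actual")

    sc.stop()
  }
}
```

This only detects the truncation after the fact; it does not prevent it, and on a large RDD the extra count() passes are themselves expensive.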
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org