Posted to issues@spark.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2019/03/08 11:07:00 UTC

[jira] [Commented] (SPARK-27098) Flaky missing file parts when writing to Ceph without error

    [ https://issues.apache.org/jira/browse/SPARK-27098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787792#comment-16787792 ] 

Steve Loughran commented on SPARK-27098:
----------------------------------------

It was my suggestion to file this. 

* If this were AWS S3, this would "just" be S3's eventual consistency surfacing on renames: the directory listing needed to mimic the rename misses the newly committed files, most likely when the write happens immediately before the rename (v2 task commit, v1 job commit + straggler tasks). The solutions are the standard ones: use a DynamoDB table for listing consistency, or a zero-rename committer which doesn't need consistent listings. (Or: Iceberg)

* But this is Ceph, which is, AFAIK, consistent.

# Who has played with Ceph as the destination store for queries? Through the S3A libraries?
# What could be enabled/added to the Spark-level committers to detect this problem? The tasks know the files they've actually created and can report them to the job committer, which could then do a post-job-commit audit of the output and fail if something is missing (a rough sketch of that check follows below).
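
For illustration only, a minimal sketch of the kind of post-commit audit I mean; {{auditOutput}}, the destination path, and the expected count are all hypothetical, not part of any existing committer API:

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical post-commit audit: count the part files that actually landed
// under the destination and fail fast if any are missing.
def auditOutput(dest: String, expectedParts: Int, conf: Configuration): Unit = {
  val path = new Path(dest)
  val fs = path.getFileSystem(conf)
  val found = fs.listStatus(path).map(_.getPath.getName).count(_.startsWith("part-"))
  if (found != expectedParts) {
    throw new IllegalStateException(
      s"Expected $expectedParts part files under $dest but only found $found")
  }
}
{code}

Called after {{df.write.parquet(dest)}} with {{df.rdd.getNumPartitions}} as the expected count, that's roughly the manual check described in the issue below; doing the same thing inside the job committer would let the job fail instead of silently succeeding.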

[~mwlon] is going to be the one trying to debug this. 

Martin:
# You can get some more logging of what's going on in the S3A code by setting the log level for {{org.apache.hadoop.fs.s3a.S3AFileSystem}} to DEBUG and looking for log entries beginning "Rename path" (see the sketch after this list). At least I think so; that 2.7.x codebase is 3+ years old, has been frozen for all but security fixes for 12 months, and is never going to get another release (related to the AWS SDK, ironically).
# The Hadoop 2.9.x releases do have S3Guard in, and while using a remote DDB table to add consistency to a local Ceph store is pretty inefficient, it'd be interesting to see whether enabling it makes this problem go away. If it does, you've just found a bug in Ceph.
# [Ryan's S3 committers|https://github.com/rdblue/s3committer] do work with Hadoop 2.7.x. Try them.
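
To make points 1 and 2 concrete, here's a rough sketch of how you could wire those up from a Spark job. The table name and region are placeholders, the S3Guard settings only exist on Hadoop 2.9+, and normally the logging change would go into log4j.properties rather than code:

{code:scala}
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("s3a-debug").getOrCreate()

// 1. Turn up S3A logging. This only affects the local JVM; for executor-side
//    logs, set log4j.logger.org.apache.hadoop.fs.s3a=DEBUG in log4j.properties.
Logger.getLogger("org.apache.hadoop.fs.s3a.S3AFileSystem").setLevel(Level.DEBUG)

// 2. Hadoop 2.9+ only: enable S3Guard against a DynamoDB table.
//    "my-s3guard-table" and "us-east-1" are placeholders.
val hc = spark.sparkContext.hadoopConfiguration
hc.set("fs.s3a.metadatastore.impl",
  "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore")
hc.set("fs.s3a.s3guard.ddb.table", "my-s3guard-table")
hc.set("fs.s3a.s3guard.ddb.table.create", "true")
hc.set("fs.s3a.s3guard.ddb.region", "us-east-1")
{code}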


> Flaky missing file parts when writing to Ceph without error
> -----------------------------------------------------------
>
>                 Key: SPARK-27098
>                 URL: https://issues.apache.org/jira/browse/SPARK-27098
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.0
>            Reporter: Martin Loncaric
>            Priority: Major
>
> https://stackoverflow.com/questions/54935822/spark-s3a-write-omits-upload-part-without-failure/55031233?noredirect=1#comment96835218_55031233
> Using Spark 2.4.0 with Hadoop 2.7, hadoop-aws 2.7.5, and the Ceph S3 endpoint, occasionally a file part will be missing, e.g. part 00003 here:
> ```
> > aws s3 ls my-bucket/folder/
> 2019-02-28 13:07:21          0 _SUCCESS
> 2019-02-28 13:06:58   79428651 part-00000-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:06:59   79586172 part-00001-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:00   79561910 part-00002-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:01   79192617 part-00004-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:07   79364413 part-00005-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:08   79623254 part-00006-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79445030 part-00007-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79474923 part-00008-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:11   79477310 part-00009-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:12   79331453 part-00010-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79567600 part-00011-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79388012 part-00012-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:14   79308387 part-00013-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:15   79455483 part-00014-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:17   79512342 part-00015-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79403307 part-00016-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79617769 part-00017-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:19   79333534 part-00018-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:20   79543324 part-00019-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> ```
> However, the write succeeds and leaves a _SUCCESS file.
> This can be caught by additionally checking afterward whether the number of written file parts agrees with the number of partitions, but Spark should at least fail on its own and leave a meaningful stack trace in this case.


