
Posted to issues@spark.apache.org by "Shixiong Zhu (JIRA)" <ji...@apache.org> on 2016/04/16 01:26:25 UTC

[jira] [Created] (SPARK-14678) Add a file sink log to support versioning and compaction

Shixiong Zhu created SPARK-14678:
------------------------------------

             Summary: Add a file sink log to support versioning and compaction
                 Key: SPARK-14678
                 URL: https://issues.apache.org/jira/browse/SPARK-14678
             Project: Spark
          Issue Type: Improvement
          Components: SQL
            Reporter: Shixiong Zhu
            Assignee: Shixiong Zhu


To use FileStreamSink in production, its log must meet two requirements:

1. Versioning. A future Spark version should be able to read the metadata written by an older FileStreamSink.
2. Compaction. Reading from many small files is usually slow, so small metadata files should be compacted into larger ones.

See the PR description for more details.
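
As a rough illustration of the two requirements only (this is not the design from the PR, and not Spark code), a toy in-memory log might look like the sketch below. The names SinkLog, VERSION and COMPACT_INTERVAL, the "<batch>.compact" naming, and the JSON line format are all assumptions made up for this example. Every file starts with a version line, and every COMPACT_INTERVAL-th batch is written as a single file that folds in all earlier entries.

```python
import json

VERSION = "v1"        # hypothetical version tag, written as the first line of every file
COMPACT_INTERVAL = 3  # hypothetical: every 3rd batch is written as a compacted file


class SinkLog:
    """Toy metadata log; a dict stands in for the log directory."""

    def __init__(self):
        self.files = {}  # file name -> serialized content

    def _is_compaction_batch(self, batch_id):
        return (batch_id + 1) % COMPACT_INTERVAL == 0

    def add(self, batch_id, paths):
        """Record the output file paths written by one batch."""
        if self._is_compaction_batch(batch_id):
            # Fold every earlier entry into this single, larger file.
            paths = self.all_paths(batch_id - 1) + paths
            name = f"{batch_id}.compact"
        else:
            name = str(batch_id)
        body = "\n".join(json.dumps({"path": p}) for p in paths)
        self.files[name] = VERSION + "\n" + body

    def _read(self, name):
        version, *lines = self.files[name].split("\n")
        if version != VERSION:
            # A real reader could dispatch on the version instead of failing.
            raise ValueError(f"unsupported log version: {version}")
        return [json.loads(line)["path"] for line in lines if line]

    def all_paths(self, latest_batch_id):
        """Return all paths up to latest_batch_id, starting from the newest compacted file."""
        result, start = [], 0
        for b in range(latest_batch_id, -1, -1):
            if self._is_compaction_batch(b):
                result = self._read(f"{b}.compact")
                start = b + 1
                break
        for b in range(start, latest_batch_id + 1):
            result += self._read(str(b))
        return result


if __name__ == "__main__":
    log = SinkLog()
    for i, p in enumerate(["a", "b", "c", "d"]):
        log.add(i, [p])
    print(sorted(log.files))
    print(log.all_paths(3))
```

With this layout, a reader that wants everything up to batch 3 opens only "2.compact" and "3" rather than four separate files, and the leading version line gives a later reader something to check before parsing.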



