Posted to issues@spark.apache.org by "Rahul Singhal (JIRA)" <ji...@apache.org> on 2014/06/12 07:49:01 UTC

[jira] [Commented] (SPARK-2127) Use application specific folders to dump metrics via CsvSink

    [ https://issues.apache.org/jira/browse/SPARK-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028854#comment-14028854 ] 

Rahul Singhal commented on SPARK-2127:
--------------------------------------

A PR is in the works.

> Use application specific folders to dump metrics via CsvSink
> ------------------------------------------------------------
>
>                 Key: SPARK-2127
>                 URL: https://issues.apache.org/jira/browse/SPARK-2127
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Rahul Singhal
>            Priority: Minor
>
> Currently, when using the CsvSink, all applications' CSV metrics are dumped into the same root folder (configured via "*.sink.csv.directory" in metrics.properties). Also, files that have common names (e.g. "jvm.PS-MarkSweep.count.csv") are reused. And if the same application is run multiple times, the metrics are appended to the previously existing files.
> This makes it harder to parse these files and extract the information that one might be looking for. I suggest that a unique folder be created every time an application is run and used to hold the metrics from that particular run only. This unique folder could be created in a way similar to the one that is currently created for logging application events (e.g. "spark-pi-1402484928439").
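
To illustrate the naming scheme suggested above, here is a minimal sketch in Python. It is not the Spark implementation or the pending PR. The helper name app_metrics_dir and its parameters are hypothetical, and the slug rule (lowercase the name, replace non-alphanumeric runs with "-") is only an assumption chosen to reproduce the shape of the "spark-pi-1402484928439" example.

```python
import os
import re
import time


def app_metrics_dir(root_dir, app_name, start_time_ms=None):
    """Build a per-run metrics folder path under root_dir.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if start_time_ms is None:
        # Default to the current time in milliseconds since the epoch.
        start_time_ms = int(time.time() * 1000)
    # Assumed slug rule: lowercase the name and collapse each run of
    # non-alphanumeric characters into a single "-".
    slug = re.sub(r"[^a-z0-9]+", "-", app_name.lower()).strip("-")
    return os.path.join(root_dir, "{}-{}".format(slug, start_time_ms))
```

Calling app_metrics_dir(root, "Spark Pi", 1402484928439) would give a path whose last component is "spark-pi-1402484928439", so two runs started at different times would no longer append to the same CSV files.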



--
This message was sent by Atlassian JIRA
(v6.2#6252)