Posted to user@spark.apache.org by Matthew Dailey <ma...@gmail.com> on 2016/11/28 19:48:36 UTC

Re: Spark Metrics: custom source/sink configurations not getting recognized

I just stumbled upon this issue as well in Spark 1.6.2 when trying to write
my own custom Sink.  For anyone else who runs into this issue, there are
two relevant JIRAs that I found, but no solution as of yet:
- https://issues.apache.org/jira/browse/SPARK-14151 - Propose to refactor
and expose Metrics Sink and Source interface
- https://issues.apache.org/jira/browse/SPARK-18115 - Custom metrics
Sink/Source prevent Executor from starting
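
In case it helps, here is roughly the skeleton I ended up with for 1.6.x. Because
the Sink trait is still private[spark] (which is exactly what SPARK-14151 is
about), the class has to live in Spark's own org.apache.spark.metrics.sink
package as far as I can tell; the class name, the sink name, and the
'pollPeriod' option below are just placeholders:

package org.apache.spark.metrics.sink

import java.util.Properties
import java.util.concurrent.TimeUnit

import com.codahale.metrics.{ConsoleReporter, MetricRegistry}

import org.apache.spark.SecurityManager

// Spark 1.6.x instantiates sinks reflectively with this constructor
// signature: (Properties, MetricRegistry, SecurityManager).
class MyConsoleSink(
    val properties: Properties,
    val registry: MetricRegistry,
    securityMgr: SecurityManager)
  extends Sink {

  // Illustrative option, read from metrics.properties,
  // e.g. executor.sink.myConsole.pollPeriod=10
  private val pollPeriod = properties.getProperty("pollPeriod", "10").toInt

  private val reporter = ConsoleReporter.forRegistry(registry)
    .convertRatesTo(TimeUnit.SECONDS)
    .convertDurationsTo(TimeUnit.MILLISECONDS)
    .build()

  override def start(): Unit = reporter.start(pollPeriod, TimeUnit.SECONDS)
  override def stop(): Unit = reporter.stop()
  override def report(): Unit = reporter.report()
}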

On Thu, Sep 8, 2016 at 3:23 PM, map reduced <k3...@gmail.com> wrote:

> Can this be listed as an issue on JIRA?
>
> On Wed, Sep 7, 2016 at 10:19 AM, map reduced <k3...@gmail.com> wrote:
>
>> Thanks for the reply, I wish it did. We have an internal metrics system
>> that we need to submit metrics to. I am sure that the approaches I've tried
>> work with a YARN deployment, but not with standalone.
>>
>> Thanks,
>> KP
>>
>> On Tue, Sep 6, 2016 at 11:36 PM, Benjamin Kim <bb...@gmail.com> wrote:
>>
>>> We use Graphite/Grafana for custom metrics. We found Spark’s metrics not
>>> to be customizable, so we write directly to Graphite using its API, which
>>> was very easy to do with Java’s socket library in Scala. It works great for
>>> us, and we are going one step further, using Sensu to alert us if there is
>>> an anomaly in the metrics beyond the norm.
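>>>
>>> Roughly, each write looks like the sketch below (the host, port, and metric
>>> path are placeholders for your own setup; Graphite's plaintext protocol on
>>> port 2003 is assumed, and a real client would keep the socket open rather
>>> than reconnect per data point):
>>>
>>> import java.io.PrintWriter
>>> import java.net.Socket
>>>
>>> object GraphiteClient {
>>>   // One line per data point, per Graphite's plaintext protocol:
>>>   // "<metric.path> <value> <epoch-seconds>\n"
>>>   def send(host: String, port: Int, path: String, value: Double): Unit = {
>>>     val socket = new Socket(host, port)
>>>     try {
>>>       val out = new PrintWriter(socket.getOutputStream, true)
>>>       out.println(s"$path $value ${System.currentTimeMillis() / 1000}")
>>>     } finally {
>>>       socket.close()
>>>     }
>>>   }
>>> }
>>>
>>> // e.g. GraphiteClient.send("graphite.example.com", 2003, "myapp.events.count", 42.0)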
>>>
>>> Hope this helps.
>>>
>>> Cheers,
>>> Ben
>>>
>>>
>>> On Sep 6, 2016, at 9:52 PM, map reduced <k3...@gmail.com> wrote:
>>>
>>> Hi, does anyone have any ideas, please?
>>>
>>> On Mon, Sep 5, 2016 at 8:30 PM, map reduced <k3...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I've written a custom metrics source/sink for my Spark Streaming app
>>>> and I am trying to initialize it from metrics.properties - but that doesn't
>>>> work from the executors. I don't have control over the machines in the Spark
>>>> cluster, so I can't copy the properties file into $SPARK_HOME/conf/ on the
>>>> cluster. I have it in the fat jar where my app lives, but by the time the
>>>> fat jar is downloaded to the worker nodes, the executors have already started
>>>> and their metrics system is already initialized - so it doesn't pick up my
>>>> file with the custom source configuration in it.
>>>>
>>>> Following this post
>>>> <https://stackoverflow.com/questions/38924581/spark-metrics-how-to-access-executor-and-worker-data>,
>>>> I've specified 'spark.files
>>>> <https://spark.apache.org/docs/latest/configuration.html>=metrics.properties'
>>>> and 'spark.metrics.conf=metrics.properties', but by the time
>>>> 'metrics.properties' is shipped to the executors, their metrics system is
>>>> already initialized.
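>>>>
>>>> For concreteness, this is the shape of what I tried (the source/sink class
>>>> names below are just stand-ins for the ones in my app):
>>>>
>>>> # metrics.properties, bundled with the app
>>>> executor.source.mySource.class=com.example.metrics.MySource
>>>> executor.sink.mySink.class=com.example.metrics.MySink
>>>> executor.sink.mySink.propName=myProp
>>>>
>>>> # passed on spark-submit
>>>> --conf spark.files=metrics.properties \
>>>> --conf spark.metrics.conf=metrics.properties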
>>>>
>>>> If I initialize my own metrics system, it picks up my file, but then I'm
>>>> missing the master/executor-level metrics/properties (e.g.
>>>> executor.sink.mySink.propName=myProp - I can't read 'propName' from
>>>> 'mySink') since those are initialized
>>>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L84> by
>>>> Spark's metrics system.
>>>>
>>>> Is there a (programmatic) way to have 'metrics.properties' shipped
>>>> before the executors initialize
>>>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L335>?
>>>>
>>>> Here's my SO question
>>>> <https://stackoverflow.com/questions/39340080/spark-metrics-custom-source-sink-configurations-not-getting-recognized>.
>>>>
>>>> Thanks,
>>>>
>>>> KP
>>>>
>>>
>>>
>>>
>>
>