Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/05/15 15:38:00 UTC
[jira] [Resolved] (SPARK-2769) Ganglia Support Broken / Not working
[ https://issues.apache.org/jira/browse/SPARK-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-2769.
------------------------------
Resolution: Cannot Reproduce
> Ganglia Support Broken / Not working
> ------------------------------------
>
> Key: SPARK-2769
> URL: https://issues.apache.org/jira/browse/SPARK-2769
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.1.0
> Environment: Linux Red Hat 6.4 on Spark 1.1.0
> Reporter: Stephen Walsh
> Labels: Ganglia, GraphiteSink,, Metrics
>
> Hi all,
> I've built Spark 1.1.0 with sbt, with Ganglia enabled, against Hadoop 2.4.0.
> No issues there: Spark works fine on Hadoop 2.4.0, and Ganglia (GraphiteSink) is installed.
> I've added the following to metrics.properties:
> *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
> *.sink.graphite.host=HOSTNAME
> *.sink.graphite.port=8649
> *.sink.graphite.period=1
> *.sink.graphite.prefix=aa
> and I get this error message:
> 14/07/31 05:39:00 WARN graphite.GraphiteReporter: Unable to report to Graphite
> java.net.SocketException: Broken pipe
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
> at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
> at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
> at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
> at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
> at java.io.BufferedWriter.flush(BufferedWriter.java:254)
> at com.codahale.metrics.graphite.Graphite.send(Graphite.java:77)
> at com.codahale.metrics.graphite.GraphiteReporter.reportGauge(GraphiteReporter.java:254)
> at com.codahale.metrics.graphite.GraphiteReporter.report(GraphiteReporter.java:156)
> at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:107)
> at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:86)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> From looking at the code I see the following.
> val graphite: Graphite = new Graphite(new InetSocketAddress(host, port))
> val reporter: GraphiteReporter = GraphiteReporter.forRegistry(registry)
>   .convertDurationsTo(TimeUnit.MILLISECONDS)
>   .convertRatesTo(TimeUnit.SECONDS)
>   .prefixedWith(prefix)
>   .build(graphite)
> https://github.com/apache/spark/blob/87bd1f9ef7d547ee54a8a83214b45462e0751efb/core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala#L69
> Followed by
> override def start() {
>   reporter.start(pollPeriod, pollUnit)
> }
> I noticed that it fails when we first try to send a message, but nowhere do I see graphite.connect() being called.
> https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/Graphite.java#L62
> It seems to fail in the send function:
> https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/Graphite.java#L77
> With "this.writer" not initialized, the "writer.write" call will fail.
> The GraphiteBuilder doesn't call it either when creating the "reporter" object.
> https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/GraphiteReporter.java#L113
> Maybe I'm looking in the wrong area or passing in the wrong values, but with so little logging I suspect it is a bug.
> EDIT:
> Found out where connect gets called:
> https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/GraphiteReporter.java#L153
> and this is called from here
> https://github.com/dropwizard/metrics/blob/99dc540c2cbe6bb3be304e20449fb641c7f5382a/metrics-core/src/main/java/com/codahale/metrics/ScheduledReporter.java#L98
> which is called from here
> https://github.com/dropwizard/metrics/blob/99dc540c2cbe6bb3be304e20449fb641c7f5382a/metrics-core/src/main/java/com/codahale/metrics/ScheduledReporter.java#L98
> but the issue still stands. :/
> Edit 2:
> My ports are open and listening:
> [root@rtr-dev-spark4 ~]# lsof -i :8649
> COMMAND   PID    USER FD TYPE  DEVICE SIZE/OFF NODE NAME
> gmond   32173 ganglia 5u IPv4 3480253      0t0  UDP rtr-dev-spark4.ord2012:8649
> gmond   32173 ganglia 6u IPv4 3480255      0t0  TCP rtr-dev-spark4.ord2012:8649 (LISTEN)
> gmond   32173 ganglia 7u IPv4 3480257      0t0  UDP rtr-dev-spark4.ord2012:55523->rtr-dev-spark4.ord2012:8649
> Regards
> Steve
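For context on the trace quoted above: Graphite.send writes one line per metric in Graphite's plaintext format, "<path> <value> <timestamp>" terminated by a newline, over a TCP connection. The Java sketch below illustrates that wire format using only java.net, with a throwaway loopback listener standing in for a Graphite (Carbon) receiver. The class name, method names, metric name, and timestamp are hypothetical, and this is not the dropwizard implementation. As an unconfirmed possibility only: port 8649 in the lsof output belongs to gmond, a Ganglia daemon rather than a Graphite receiver, and a listener that does not read this format and closes the connection could lead to a broken pipe on write.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names here are illustrative, not part of Spark or dropwizard metrics.
public class GraphiteLineSketch {

    // Builds one Graphite plaintext line: "<prefix>.<name> <value> <epochSeconds>\n".
    static String formatLine(String prefix, String name, String value, long epochSeconds) {
        return prefix + "." + name + " " + value + " " + epochSeconds + "\n";
    }

    // Sends the line to a local stand-in listener and returns what the listener read.
    static String sendAndCapture(String line) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Client side: connect, write, flush, close (roughly one report cycle).
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 Writer writer = new BufferedWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
                writer.write(line);
                writer.flush();
            }
            // Listener side: accept the queued connection and read the single line.
            try (Socket accepted = server.accept();
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "aa" matches the prefix in the reporter's metrics.properties; the rest is made up.
        String line = formatLine("aa", "jvm.heap.used", "42", 1406785140L);
        System.out.println(sendAndCapture(line));
    }
}
```

Pointing a small client like this at the configured host and port is one way to check whether the listener accepts Graphite plaintext lines at all. Carbon's plaintext listener conventionally uses port 2003, so the configured port 8649 may be worth double-checking against the receiver that is actually expected.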