Posted to issues@spark.apache.org by "David Deuber (Jira)" <ji...@apache.org> on 2023/04/27 15:31:00 UTC

[jira] [Created] (SPARK-43310) Dataset.observe is ignored when writing to Kafka with batch query

David Deuber created SPARK-43310:
------------------------------------

             Summary: Dataset.observe is ignored when writing to Kafka with batch query
                 Key: SPARK-43310
                 URL: https://issues.apache.org/jira/browse/SPARK-43310
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0, 3.3.2
            Reporter: David Deuber


When writing to Kafka with a batch query, metrics defined with {{Dataset.observe}} are not recorded. 

For example, 
{code:java}
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.util.QueryExecutionListener
import spark.implicits._

// Print the observed metrics after every successful query execution.
spark.listenerManager.register(new QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
    println(qe.observedMetrics)
  }

  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = {
    // pass
  }
})

val df = Seq(("k", "v")).toDF("key", "value")
// Attach an observation that should surface "some_metric" to the listener.
val observed = df.observe("my_observation", lit("metric_value").as("some_metric"))
observed
  .write
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1")
  .option("topic", "topic1")
  .save()
{code}
prints {{Map()}}, i.e. the observed metric is missing.
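
For comparison, here is a minimal sketch (my own illustration, not part of the original reproduction) that triggers the same observation with {{collect()}} instead of the Kafka write, assuming the listener registered above is still active:
{code:java}
// Contrast case: same observation, executed with collect() rather than the Kafka sink.
val df2 = Seq(("k", "v")).toDF("key", "value")
val observed2 = df2.observe("my_observation", lit("metric_value").as("some_metric"))
// For batch queries, observed metrics are reported to QueryExecutionListener
// when the query runs; here the listener should print a map containing
// the "my_observation" entry.
observed2.collect()
{code}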



