Posted to user@spark.apache.org by Patrick McGloin <mc...@gmail.com> on 2017/05/19 12:55:59 UTC
Re: Is there a Kafka sink for Spark Structured Streaming
# Write key-value data from a DataFrame to a Kafka topic specified in an option
query = df \
    .selectExpr("CAST(userId AS STRING) AS key", "to_json(struct(*)) AS value") \
    .writeStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "host1:port1,host2:port2") \
    .option("topic", "topic1") \
    .option("checkpointLocation", "/path/to/HDFS/dir") \
    .start()
Described here:
https://databricks.com/blog/2017/04/26/processing-data-in-apache-kafka-with-structured-streaming-in-apache-spark-2-2.html
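To make the selectExpr line concrete, this is roughly what it produces for a single record — a plain-Python sketch, not Spark code, and the sample row is hypothetical: userId is cast to a string and used as the Kafka message key, while the whole row serialized as JSON becomes the value.

```python
import json

# Hypothetical sample record, standing in for one DataFrame row.
row = {"userId": 42, "action": "click", "ts": "2017-05-19T12:55:59Z"}

# CAST(userId AS STRING) AS key
key = str(row["userId"])

# to_json(struct(*)) AS value -- the whole row as one JSON string
value = json.dumps(row, sort_keys=True)

print(key)    # -> 42
print(value)  # -> {"action": "click", "ts": "2017-05-19T12:55:59Z", "userId": 42}
```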
On 19 May 2017 at 10:45, <ka...@gmail.com> wrote:
> Is there a Kafka sink for Spark Structured Streaming ?
>
> Sent from my iPhone
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by Michael Armbrust <mi...@databricks.com>.
There is an RC here. Please test!
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Apache-Spark-2-2-0-RC2-td21497.html
On Fri, May 19, 2017 at 4:07 PM, kant kodali <ka...@gmail.com> wrote:
> Hi Patrick,
>
> I am using 2.1.1 and I tried the above code you sent and I get
>
> "java.lang.UnsupportedOperationException: Data source kafka does not
> support streamed writing"
>
> so this probably works only from Spark 2.2 onwards. I am not sure
> when it will be officially released.
>
> Thanks!
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by kant kodali <ka...@gmail.com>.
Thanks!
On Fri, May 19, 2017 at 4:50 PM, Tathagata Das <ta...@gmail.com>
wrote:
> Should release by the end of this month.
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by Tathagata Das <ta...@gmail.com>.
Should release by the end of this month.
On Fri, May 19, 2017 at 4:07 PM, kant kodali <ka...@gmail.com> wrote:
> Hi Patrick,
>
> I am using 2.1.1 and I tried the above code you sent and I get
>
> "java.lang.UnsupportedOperationException: Data source kafka does not
> support streamed writing"
>
> so this probably works only from Spark 2.2 onwards. I am not sure
> when it will be officially released.
>
> Thanks!
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by kant kodali <ka...@gmail.com>.
Hi Patrick,
I am using 2.1.1 and I tried the above code you sent and I get
"java.lang.UnsupportedOperationException: Data source kafka does not
support streamed writing"
so this probably works only from Spark 2.2 onwards. I am not sure when
it will be officially released.
Thanks!
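Since the Kafka sink only arrives in Spark 2.2, one way to fail early is to guard on the runtime version before calling writeStream.format("kafka") — a minimal plain-Python sketch, where the hardcoded version string stands in for spark.version:

```python
# Minimal sketch: the built-in Kafka sink requires Spark >= 2.2.
# The hardcoded string stands in for spark.version at runtime.
version = "2.1.1"

major, minor = (int(p) for p in version.split(".")[:2])
supports_kafka_sink = (major, minor) >= (2, 2)

print(supports_kafka_sink)  # -> False for 2.1.1
```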
On Fri, May 19, 2017 at 8:39 AM, <ka...@gmail.com> wrote:
> Hi!
>
> Is this possible in Spark 2.1.1?
>
> Sent from my iPhone
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by ka...@gmail.com.
Hi!
Is this possible in Spark 2.1.1?
Sent from my iPhone