Posted to user@spark.apache.org by Patrick McGloin <mc...@gmail.com> on 2017/05/19 12:55:59 UTC
Re: Is there a Kafka sink for Spark Structured Streaming
# Write key-value data from a DataFrame to a Kafka topic specified in an option
query = df \
    .selectExpr("CAST(userId AS STRING) AS key", "to_json(struct(*)) AS value") \
    .writeStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "host1:port1,host2:port2") \
    .option("topic", "topic1") \
    .option("checkpointLocation", "/path/to/HDFS/dir") \
    .start()
Described here:
https://databricks.com/blog/2017/04/26/processing-data-in-apache-kafka-with-structured-streaming-in-apache-spark-2-2.html
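To make the selectExpr line concrete, this is roughly what it produces for a single record — a plain-Python sketch, not Spark code, and the sample row is hypothetical: userId is cast to a string and used as the Kafka message key, while the whole row serialized as JSON becomes the value.

```python
import json

# Hypothetical sample record, standing in for one DataFrame row.
row = {"userId": 42, "action": "click", "ts": "2017-05-19T12:55:59Z"}

# CAST(userId AS STRING) AS key
key = str(row["userId"])

# to_json(struct(*)) AS value -- the whole row as one JSON string
value = json.dumps(row, sort_keys=True)

print(key)    # -> 42
print(value)  # -> {"action": "click", "ts": "2017-05-19T12:55:59Z", "userId": 42}
```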
On 19 May 2017 at 10:45, <ka...@gmail.com> wrote:
> Is there a Kafka sink for Spark Structured Streaming ?
>
> Sent from my iPhone
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by Michael Armbrust <mi...@databricks.com>.
There is an RC here. Please test!
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Apache-Spark-2-2-0-RC2-td21497.html
On Fri, May 19, 2017 at 4:07 PM, kant kodali <ka...@gmail.com> wrote:
> Hi Patrick,
>
> I am using 2.1.1 and I tried the above code you sent and I get
>
> "java.lang.UnsupportedOperationException: Data source kafka does not
> support streamed writing"
>
> so this probably works only from Spark 2.2 onwards. I am not sure
> when it will be officially released.
>
> Thanks!
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by kant kodali <ka...@gmail.com>.
Thanks!
On Fri, May 19, 2017 at 4:50 PM, Tathagata Das <ta...@gmail.com>
wrote:
> Should release by the end of this month.
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by Tathagata Das <ta...@gmail.com>.
Should release by the end of this month.
On Fri, May 19, 2017 at 4:07 PM, kant kodali <ka...@gmail.com> wrote:
> Hi Patrick,
>
> I am using 2.1.1 and I tried the above code you sent and I get
>
> "java.lang.UnsupportedOperationException: Data source kafka does not
> support streamed writing"
>
> so this probably works only from Spark 2.2 onwards. I am not sure
> when it will be officially released.
>
> Thanks!
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by kant kodali <ka...@gmail.com>.
Hi Patrick,
I am using 2.1.1 and I tried the above code you sent and I get
"java.lang.UnsupportedOperationException: Data source kafka does not
support streamed writing"
so this probably works only from Spark 2.2 onwards. I am not sure when
it will be officially released.
Thanks!
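Since the Kafka sink only arrives in Spark 2.2, one way to fail early is to guard on the runtime version before calling writeStream.format("kafka") — a minimal plain-Python sketch, where the hardcoded version string stands in for spark.version:

```python
# Minimal sketch: the built-in Kafka sink requires Spark >= 2.2.
# The hardcoded string stands in for spark.version at runtime.
version = "2.1.1"

major, minor = (int(p) for p in version.split(".")[:2])
supports_kafka_sink = (major, minor) >= (2, 2)

print(supports_kafka_sink)  # -> False for 2.1.1
```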
On Fri, May 19, 2017 at 8:39 AM, <ka...@gmail.com> wrote:
> Hi!
>
> Is this possible in Spark 2.1.1?
>
> Sent from my iPhone
>
Re: Is there a Kafka sink for Spark Structured Streaming
Posted by ka...@gmail.com.
Hi!
Is this possible in Spark 2.1.1?
Sent from my iPhone