Posted to user@spark.apache.org by ColinMc <co...@shiftenergy.com> on 2015/03/12 17:58:08 UTC

KafkaUtils and specifying a specific partition

Hi,

How do you use KafkaUtils to consume from one specific partition? I'm writing
per-customer Marathon jobs in which each customer is given one partition of a
Kafka topic. Each job looks up its customer's partition in our database and
should read only that partition's messages.

I misread KafkaUtils when creating the stream: I didn't realize that the value
in the topic map is the number of partitions per topic, not a partition ID.

If KafkaUtils doesn't support this, is there another Spark API for Kafka that
does?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/KafkaUtils-and-specifying-a-specific-partition-tp22018.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

Re: KafkaUtils and specifying a specific partition

Posted by mykidong <my...@gmail.com>.

If you are open to using a different Kafka receiver in place of Spark's
built-in one, have a look at this:
https://github.com/mykidong/spark-kafka-simple-consumer-receiver/blob/master/src/main/java/spark/streaming/receiver/kafka/KafkaReceiverUtils.java

It lets you consume the stream from just the partition you specify.

- Kidong.






Re: KafkaUtils and specifying a specific partition

Posted by Akhil Das <ak...@sigmoidanalytics.com>.

Here's a simple consumer that does that:
https://github.com/dibbhatt/kafka-spark-consumer/

Thanks
Best Regards


Re: KafkaUtils and specifying a specific partition

Posted by Cody Koeninger <co...@koeninger.org>.

KafkaUtils.createDirectStream, added in Spark 1.3, lets you specify a
particular topic and partition.
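
As a minimal sketch of that suggestion, the overload of createDirectStream that
takes a fromOffsets map can be limited to a single (topic, partition) pair. This
assumes the spark-streaming-kafka artifact for Kafka 0.8 that shipped with
Spark 1.3. The broker address, topic name, partition number, and starting
offset are hypothetical placeholders; in the job described above, the partition
would come from the customer database.

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka.KafkaUtils

object SinglePartitionStream {

  // Build the starting offsets for exactly one partition of one topic.
  // Only the partitions listed in this map are consumed by the direct stream.
  def customerOffsets(topic: String, partition: Int, startOffset: Long): Map[TopicAndPartition, Long] =
    Map(TopicAndPartition(topic, partition) -> startOffset)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("single-partition-stream")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Hypothetical values: replace with your broker list and the
    // topic/partition looked up for the customer.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    // The start offset must still exist in Kafka; 0L is only an illustration.
    val fromOffsets = customerOffsets("customer-events", 3, 0L)

    // The message handler turns each Kafka record into a (key, value) pair.
    val stream: InputDStream[(String, String)] =
      KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder, (String, String)](
        ssc,
        kafkaParams,
        fromOffsets,
        (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message))

    stream.map(_._2).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because the direct stream does not track offsets in ZooKeeper, the job has to
decide where to start on each run, for example by storing the last processed
offset next to the customer's partition in the database.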


Re: KafkaUtils and specifying a specific partition

Posted by Colin McQueen <co...@shiftenergy.com>.

Thanks! :)

Colin McQueen
*Software Developer*


Re: KafkaUtils and specifying a specific partition

Posted by Jeffrey Jedele <je...@gmail.com>.

Hi Colin,

My understanding is that this is currently not possible with KafkaUtils.
You would have to write a custom receiver using Kafka's SimpleConsumer API.

https://spark.apache.org/docs/1.2.0/streaming-custom-receivers.html
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
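
A rough skeleton of such a receiver, following the pattern in the custom
receivers guide, might look like the sketch below. The class name and the
fetchBatch function are hypothetical: fetchBatch stands in for the
SimpleConsumer fetch logic from the Kafka example (leader lookup, fetch
request, error handling), which is not reproduced here. It is assumed to return
(offset, message) pairs starting at the requested offset.

```scala
import scala.util.control.NonFatal

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver that reads a single partition of a single topic.
// fetchBatch(topic, partition, offset) is an assumed helper wrapping Kafka's
// SimpleConsumer; it must be serializable because the receiver is shipped to
// an executor.
class SinglePartitionReceiver(
    topic: String,
    partition: Int,
    startOffset: Long,
    fetchBatch: (String, Int, Long) => Seq[(Long, String)]
) extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  override def onStart(): Unit = {
    // onStart must not block, so the fetch loop runs on its own thread.
    new Thread("single-partition-receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  override def onStop(): Unit = {
    // Nothing to clean up here: the fetch loop exits once isStopped() is true.
  }

  private def receive(): Unit = {
    var nextOffset = startOffset
    try {
      while (!isStopped()) {
        val batch = fetchBatch(topic, partition, nextOffset)
        batch.foreach { case (offset, message) =>
          store(message)          // hand the record to Spark
          nextOffset = offset + 1 // continue after the last stored record
        }
        if (batch.isEmpty) Thread.sleep(100) // avoid a busy loop when idle
      }
    } catch {
      // Ask Spark to restart the receiver on recoverable errors.
      case NonFatal(e) => restart("Error receiving from Kafka", e)
    }
  }
}
```

The receiver would then be plugged in with
ssc.receiverStream(new SinglePartitionReceiver(...)), passing the partition
looked up for the customer.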

Regards,
Jeff
