Posted to user@storm.apache.org by 이승진 <sw...@navercorp.com> on 2014/10/17 05:45:32 UTC

Several questions regarding kafkaspout

Dear all,
 
1. When there are multiple topics to read from, do I have to provide a different ID for each topic?
It seems that only one consumer offset is written to ZooKeeper if I use the same ID for two topic consumers.
 
SpoutConfig conf1 = new SpoutConfig(zkBrokerHosts, "topic1", zkRoot, "my-topology-name");
KafkaSpout spout1 = new KafkaSpout(conf1);

SpoutConfig conf2 = new SpoutConfig(zkBrokerHosts, "topic2", zkRoot, "my-topology-name");
KafkaSpout spout2 = new KafkaSpout(conf2);

2. In that case, can I set multiple topics in a single SpoutConfig? For example:
SpoutConfig conf = new SpoutConfig(zkBrokerHosts, new String[]{"topic1", "topic2"}, zkRoot, "my-topology-name");

3. How can I figure out how far behind my consumer is? Kafka provides a way to check lag:
http://community.spiceworks.com/how_to/show/77610-how-far-behind-is-your-kafka-consumer
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group <consumer_group_name> --zkconnect <zookeeper_host:port> --topic <topic_name>
But KafkaSpout stores its offsets differently from traditional consumers, so that command fails.

Thanks in advance.
Sincerely,





Re: Several questions regarding kafkaspout

Posted by Harsha <st...@harsha.io>.

Hi,

    You are using the same id, "my-topology-name", in the
SpoutConfig for both spouts. KafkaSpout uses the Simple API
(Kafka consumer API) to read from Kafka and stores offsets in
ZooKeeper by itself. It uses SpoutConfig.id as a node path in
ZooKeeper. Since you are using the same id in both places, each
spout overwrites the other's consumer offset. Give each spout
its own id to avoid this.
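
To illustrate why a shared id collides, here is a minimal, self-contained sketch. It assumes the spout commits offsets under a node of the form <zkRoot>/<id>/partition_<n>, with no topic name in the path. That layout is an assumption for illustration, not taken from this thread; the class name, the root "/kafka-spout" and the suffixed ids are hypothetical.

```java
public class OffsetPathSketch {

    // Assumed node layout: <zkRoot>/<id>/partition_<n>.
    // The topic name does not appear, so the id alone separates consumers.
    static String committedPath(String zkRoot, String id, int partition) {
        return zkRoot + "/" + id + "/partition_" + partition;
    }

    public static void main(String[] args) {
        String zkRoot = "/kafka-spout"; // hypothetical root, for illustration only

        // Same id for topic1 and topic2: partition 0 of both maps to one node.
        System.out.println(committedPath(zkRoot, "my-topology-name", 0));

        // One id per topic: each spout gets its own node.
        System.out.println(committedPath(zkRoot, "my-topology-name-topic1", 0));
        System.out.println(committedPath(zkRoot, "my-topology-name-topic2", 0));
    }
}
```

Under this assumption, passing a distinct id as the fourth SpoutConfig argument for each topic keeps the two offsets apart.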



 how to figure out how far behind is my consumer?

Since it uses the Simple API, offset management is done by
KafkaSpout itself, which writes the offsets to a different path
in ZooKeeper. So kafka.tools.ConsumerOffsetChecker won't work in
this case.
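
One possible workaround is to compute the lag by hand: read the offset the spout committed to ZooKeeper and subtract it from the latest offset on the broker. The sketch below only shows the arithmetic. It assumes the spout's node holds JSON with a numeric "offset" field, which is not confirmed in this thread, and that the latest broker offset has been obtained separately. The class name and sample values are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpoutLagSketch {

    // Matches a numeric "offset" field in the node data (assumed format).
    private static final Pattern OFFSET =
            Pattern.compile("\"offset\"\\s*:\\s*(\\d+)");

    // Extracts the committed offset from the JSON text stored in the node.
    static long committedOffset(String nodeJson) {
        Matcher m = OFFSET.matcher(nodeJson);
        if (!m.find()) {
            throw new IllegalArgumentException("no offset field in: " + nodeJson);
        }
        return Long.parseLong(m.group(1));
    }

    // Lag is the distance from the committed offset to the latest broker
    // offset, clamped at zero.
    static long lag(long latestBrokerOffset, long committedOffset) {
        return Math.max(0L, latestBrokerOffset - committedOffset);
    }

    public static void main(String[] args) {
        // Hypothetical node content and broker offset, for illustration only.
        String nodeJson = "{\"topic\":\"topic1\",\"partition\":0,\"offset\":1200}";
        long latestBrokerOffset = 1500L;
        System.out.println(lag(latestBrokerOffset, committedOffset(nodeJson)));
    }
}
```

The calculation has to be repeated for each partition, since the spout tracks one offset per partition.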



-Harsha


