Posted to users@kafka.apache.org by Fredrik Emilsson <Fr...@mblox.com> on 2012/08/10 16:06:48 UTC

Hadoop-consumer & partition question

Hello,

 

  I have a topic that has two partitions, and I have one broker. Adding
events works well; there are events in both partition 0 and partition 1.
The problem arises when I try to import them into Hadoop: it seems that
only the events in partition 0 are imported. I am using a script (found
here:
http://felixgv.com/post/69/automating-incremental-imports-with-the-kafka-hadoop-consumer/)
to import them.

 

  Does anyone know what the problem could be?

 

  Regards,

    Fredrik

 

 


NOTICE - This message and any attached files may contain information that is confidential and/or subject of legal privilege intended only for use by the intended recipient. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, be advised that you have received this message in error and that any dissemination, copying or use of this message or attachment is strictly forbidden, as is the disclosure of the information therein.  If you have received this message in error please notify the sender immediately and delete the message.

Re: Hadoop-consumer & partition question

Posted by Felix GV <fe...@mate1inc.com>.
I haven't used this script in a while, but if I remember correctly, you
should have a different offset file for each broker/partition combination...

In any case, the article you linked to is an outdated version of that
script (as mentioned in the block at the very beginning of the post, BTW).

A quick look at the script you linked to shows that it manages only a
single offset file, which would explain why you're consuming just one
partition. The latest version of the script manages all of the offset
files, so I think that should solve your problem.

You can find the latest version of the script here:
https://gist.github.com/1671887
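The per-broker/partition bookkeeping Felix describes can be sketched as follows. This is an illustrative assumption about the layout, not the actual script's file naming: the point is simply that each broker/partition pair gets its own offset file, so no partition is silently skipped.

```python
# Sketch: one offset file per broker/partition combination.
# File names and layout are illustrative assumptions, not the
# actual hadoop-consumer script's output.
import os
import tempfile

def offset_path(base_dir, broker_id, partition):
    """One distinct offset file per broker/partition pair."""
    return os.path.join(base_dir, f"offset-{broker_id}-{partition}.dat")

def init_offsets(base_dir, broker_id, partitions):
    """Seed a zeroed offset file for every partition of the topic."""
    for p in partitions:
        with open(offset_path(base_dir, broker_id, p), "w") as f:
            f.write("0")

base = tempfile.mkdtemp()
init_offsets(base, broker_id=0, partitions=[0, 1])
print(sorted(os.listdir(base)))  # ['offset-0-0.dat', 'offset-0-1.dat']
```

With a single offset file, only one broker/partition pair is ever tracked, which matches the symptom of seeing data from partition 0 only.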

--
Felix




RE: Hadoop-consumer & partition question

Posted by Fredrik Emilsson <Fr...@mblox.com>.
Thanks for the information!

Here is my offset file:

SEQkafka.etl.KafkaETLKey"org.apache.hadoop.io.BytesWritableh*À¦ê$ßíãðÒF.tcp://localhost:9092    device-events-2 0       5133247

Is the 0 in the file the partition? If so, then I have to figure out how to set the partition number. I guess 0 is the default value.
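The offset file itself is a binary Hadoop SequenceFile, but the readable tail of the pasted record looks like whitespace-separated fields. A sketch of splitting them; the field order (topic name, partition id, next offset) is an assumption read off the pasted record, not the documented format:

```python
# Assumed field layout of the readable part of the offset record:
# topic name, partition id, next offset to consume from.
record = "device-events-2 0 5133247"

topic, partition, offset = record.split()
print(topic)           # device-events-2
print(int(partition))  # 0  <- the partition id in question
print(int(offset))     # 5133247
```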

  Regards,
    Fredrik


Re: Hadoop-consumer & partition question

Posted by Neha Narkhede <ne...@gmail.com>.
The Hadoop consumer uses an offsets file to know which partitions to
consume and from which offset. What does your offsets file look like?

Thanks,
Neha
