Posted to users@kafka.apache.org by Jeyhun Karimov <je...@gmail.com> on 2016/08/10 15:56:12 UTC

Kafka Connect questions

Hi community,

I am using Kafka Connect to import text files into Kafka. Because I have to
give an exact file name (like ./data.txt) rather than the parent directory
(./*) as the input to Kafka Connect, how can I import files in a particular
directory as they are created at runtime? The solutions that come to my
mind are below, but I don't know how efficient they are.
- I may create a new file in that directory (after Kafka Connect has
already started) and start a new Connect instance pointing to that file,
- or I may write to only one file (just ./data.txt) all the time (with no
other data files created), so that Kafka Connect transfers data only
from that file and resumes from where it left off (as data is written),
- or any other way to handle this use case efficiently.
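
To make the second option concrete, here is a minimal sketch of "resume from where it left off" for a single growing file. It is not Kafka Connect code (connectors are written in Java, and the framework stores source offsets itself); the function name read_new_lines and the byte-offset bookkeeping are assumptions made for illustration only.

```python
def read_new_lines(path, offset):
    """Return (complete lines appended since `offset`, new byte offset).

    Hypothetical helper: `offset` is the byte position reached by the
    previous call, which the caller must persist between calls.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a half-written line
    # is left in place and picked up by a later call.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").split("\n")[:-1]
    return lines, offset + end
```

Calling it repeatedly with the offset returned by the previous call yields each complete line once, which is the behaviour the single-file approach relies on.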

My second question: is it safe to override the data transfer methods of
Kafka Connect? For example, I want to put Thread.sleep calls on the
kafka-streams side while transferring data and observe the behaviour on the
Kafka side or on the application side. You can think of it as a simulation
of load.
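
The experiment described above can be pictured with a small generic sketch: wrap whatever function moves a batch of records so that each call sleeps first. This is only an illustration of the idea, not Kafka Connect or Kafka Streams API; in a real connector the analogous place would presumably be a custom task's poll() method in Java. The names with_delay and transfer are hypothetical.

```python
import time

def with_delay(transfer, delay_seconds, sleep=time.sleep):
    """Wrap `transfer` so every call pauses first, simulating a slow stage.

    `sleep` is injectable so the wrapper can be exercised without
    actually waiting.
    """
    def delayed(*args, **kwargs):
        sleep(delay_seconds)  # artificial latency before the real work
        return transfer(*args, **kwargs)
    return delayed
```

For example, with_delay(send_batch, 0.5) would return a version of a hypothetical send_batch function that waits half a second before every batch, leaving the original function untouched.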

Cheers
Jeyhun


-- 
-Cheers

Jeyhun

Re: Kafka Connect questions

Posted by Jeyhun Karimov <je...@gmail.com>.
Thanks in advance.

Jeyhun

On Thu, Aug 11, 2016 at 7:09 AM Gwen Shapira <gw...@confluent.io> wrote:

> Maybe you need a different Kafka Connect source? I think this one may
> fit your needs better:
> https://github.com/jcustenborder/kafka-connect-spooldir
>
> It was built to copy data from files in a directory into Kafka...
-- 
-Cheers

Jeyhun

Re: Kafka Connect questions

Posted by Gwen Shapira <gw...@confluent.io>.
Maybe you need a different Kafka Connect source? I think this one may
fit your needs better:
https://github.com/jcustenborder/kafka-connect-spooldir

It was built to copy data from files in a directory into Kafka...
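
As a rough picture of what "copying files from a directory" involves, the sketch below polls a directory and reports files it has not seen before. It is not the spooldir connector's implementation or configuration; the function name find_new_files, the ".txt" suffix filter, and the in-memory seen set are all assumptions for illustration.

```python
import os

def find_new_files(directory, seen, suffix=".txt"):
    """Return paths of matching files that appeared since the last call.

    `seen` is a set of file names already handed out; it is updated
    in place so repeated calls only report newly created files.
    """
    new_names = sorted(
        name for name in os.listdir(directory)
        if name.endswith(suffix)
        and name not in seen
        and os.path.isfile(os.path.join(directory, name))
    )
    seen.update(new_names)
    return [os.path.join(directory, name) for name in new_names]
```

Calling this in a loop, and reading each returned file once, is the basic pattern; a production connector additionally has to deal with files that are still being written and with restarts.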






-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap