Posted to users@kafka.apache.org by David Newberger <da...@wandcorp.com> on 2016/07/15 15:24:01 UTC

Kafka Connect or Streams Output to local File System

Hello All,

I'm curious whether I can write output to a .txt file after doing some stream processing with Kafka Streams. The scenario I'm trying to implement is a simple web log processing application that raises alerts on specific criteria. I know I can ingest log files from the local filesystem into Kafka using Kafka Connect. I also believe I can use Kafka Streams to process the logs and look for records that meet those criteria.

What I can't tell is whether I can write the records that meet the criteria to one .txt file on the local file system, and the raw unprocessed logs to another, directly from Kafka Streams or Kafka Connect. I'd like to write both files to the local file system because this is a simple proof-of-concept application, and I'd like to avoid adding other tools to the chain if possible.

Cheers!

David Newberger


RE: Kafka Connect or Streams Output to local File System

Posted by David Newberger <da...@wandcorp.com>.
Hi Eno and Guozhang,

Thank you both! I'm glad to know it's possible. Now I'll work on implementing the POC.

David Newberger


Re: Kafka Connect or Streams Output to local File System

Posted by Guozhang Wang <wa...@gmail.com>.
Hi David,

As Eno said, you can use Kafka Streams to pipe the raw logs and the anomaly
events to two different topics, and then use two separate Kafka Connect sink
connectors to write them to two files.

An alternative that uses only Kafka Streams is the `writeAsText` operator in
the Streams DSL. You can also implement your own processor with the
Processor API to write the stream to local files, if that is acceptable. You
can take a look at this example in the web docs (search for "Applying a
custom processor"):

http://docs.confluent.io/3.0.0/streams/developer-guide.html#kafka-streams-dsl
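
To make the custom-processor idea a bit more concrete, here is a minimal sketch of just the file-writing part, under some assumptions. It deliberately does not use any Kafka classes so that it runs on its own; the class name `LogFileSplitter`, the file paths, and the alert rule are all hypothetical, and the `process(key, value)` method only mirrors the general shape of a per-record callback rather than the actual Processor API signature.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Predicate;

/**
 * Hypothetical helper: appends every record to a "raw" file and, when the
 * record matches the alert rule, also appends it to an "alerts" file.
 * Kafka-free on purpose; this is not the Kafka Streams Processor interface.
 */
final class LogFileSplitter implements AutoCloseable {
    private final BufferedWriter rawOut;
    private final BufferedWriter alertOut;
    private final Predicate<String> isAlert;

    LogFileSplitter(Path rawFile, Path alertFile, Predicate<String> isAlert) throws IOException {
        this.rawOut = open(rawFile);
        this.alertOut = open(alertFile);
        this.isAlert = isAlert;
    }

    // Open the file for appending, creating it if it does not exist yet.
    private static BufferedWriter open(Path file) throws IOException {
        return Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Called once per record; the key is unused in this sketch. */
    void process(String key, String value) throws IOException {
        writeLine(rawOut, value);
        if (isAlert.test(value)) {
            writeLine(alertOut, value);
        }
    }

    private static void writeLine(BufferedWriter out, String line) throws IOException {
        out.write(line);
        out.newLine();
        out.flush(); // flush per record so the files can be tailed while running
    }

    @Override
    public void close() throws IOException {
        try {
            rawOut.close();
        } finally {
            alertOut.close(); // close the second writer even if the first close fails
        }
    }
}
```

In a real Streams application, something like this would presumably be created when the processor is initialized, called from its per-record method, and closed when the processor shuts down. Flushing on every record keeps a proof of concept simple but would likely be too slow for high-volume logs.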

Guozhang


Re: Kafka Connect or Streams Output to local File System

Posted by Eno Thereska <en...@gmail.com>.
Hi David,

One option would be to first output your info to a topic using Kafka Streams, and then use Connect again (as a sink) to read from the topic and write to a file in the file system.

Eno
