Posted to dev@nifi.apache.org by Clay Teahouse <cl...@gmail.com> on 2017/07/04 15:21:23 UTC

nifi questions

Hello All,

I am new to NiFi. I'd appreciate your help with some questions.

1) Is there a processor like ListenTCP that would work with protobuf
messages? In particular, I am interested in processing protobuf messages
prefixed with the length of the message.
2) Is there a processor like ConsumeMQTT that would work with protobuf
messages?
3) As a general question, as with Kafka Connect, does NiFi have a means for
specifying the converters declaratively, or do I need to write a separate
processor for each converter?
4) How does NiFi compare to Kafka Connect in terms of performance?

thanks
Clay

Re: nifi questions

Posted by Clay Teahouse <cl...@gmail.com>.
Thank you for the very helpful feedback.


Re: nifi questions

Posted by Joe Witt <jo...@gmail.com>.
Ah! Good call, Mike, and thanks for adding that.


Re: nifi questions

Posted by Michael Hogue <mi...@gmail.com>.
Clay,

Regarding number one, Joe is correct. There currently isn't a processor that
can process arbitrary protobuf messages, but InvokeGRPC and ListenGRPC were
recently added (targeting 1.4.0) that can accept and send gRPC messages
(which wrap protobuf) defined by an IDL [1].

There's a how-to article with a few examples:
https://cwiki.apache.org/confluence/display/NIFI/Leveraging+gRPC+Processors

It's hard for me to say if either would meet your use case, but I thought
I'd at least mention their existence here.

Thanks,
Mike

[1]
https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-grpc-bundle/nifi-grpc-processors/src/main/resources/proto


Re: nifi questions

Posted by Joe Witt <jo...@gmail.com>.
Clay

Here are some answers to each. Happy to discuss further.

#1) No processors exist in the Apache NiFi codebase to receive or send
data using Google protobuf that I know of right now. This could work
very well with our record-oriented format and schema-aware
readers/writers, though, so perhaps it would be a good mode to offer.
If you're interested in contributing in this area, please let us know.
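
For anyone sketching a custom processor for the length-prefixed case Clay
describes, the framing logic itself is small. Here is a minimal illustration
in plain Python, assuming a 4-byte big-endian length prefix (a common
convention, though Clay's feed may use a different prefix format; the names
here are hypothetical, not part of NiFi):

```python
import struct
from io import BytesIO

def read_framed_messages(stream):
    """Yield raw payloads from a stream of length-prefixed records.

    Assumes each record is a 4-byte big-endian length followed by that
    many bytes of (e.g. protobuf-encoded) payload.
    """
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream (or truncated header)
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated message; a real processor should flag this
        yield payload

# Example: two framed payloads packed back to back.
buf = BytesIO(b"".join(struct.pack(">I", len(m)) + m
                       for m in (b"abc", b"hello")))
print(list(read_framed_messages(buf)))  # [b'abc', b'hello']
```

A real processor would also need to bound the declared length and handle
partial reads across socket buffers, but the framing idea is just this.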

#2) All of our processors that accept messages, such as Kafka, MQTT,
AMQP, JMS, etc., can bring in any format/schema of data. They are
effectively just bringing in binary data. When it comes to
routing/transforming/etc., that is when it really matters to be
format/schema aware. We have built-in support already for a number of
formats/schemas, but more importantly, with the recent 1.2.0/1.3.0
releases we've added this record concept I've mentioned. This lets us
have a series of common patterns/processors to handle records as a
concept and then plug in various readers/writers which understand
the specifics of serialization/deserialization. So, it would be easy
to extend the various methods of acquiring and delivering data for
whatever record-oriented data you have. For now, though, with regard to
protobuf, I don't think we have anything out of the box.
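
The decoupling Joe describes can be sketched outside of NiFi. This is not
NiFi's actual RecordReader/RecordSetWriter API, just a loose illustration of
the pattern in Python: processing logic operates on records, while pluggable
readers/writers own serialization:

```python
import csv
import json
from io import StringIO

# Illustrative only: NiFi's real interfaces differ. The point is that the
# "processor" below never touches bytes directly; readers and writers do.

def csv_reader(text):
    """Deserialize CSV text into a list of record dicts."""
    return list(csv.DictReader(StringIO(text)))

def json_writer(records):
    """Serialize a list of records as a JSON array."""
    return json.dumps(records)

def convert(text, reader, writer):
    """A format-agnostic 'processor': routing/transform logic would go here."""
    records = reader(text)
    return writer(records)

print(convert("id,name\n1,clay\n2,joe\n", csv_reader, json_writer))
# [{"id": "1", "name": "clay"}, {"id": "2", "name": "joe"}]
```

Adding protobuf support in this model would mean writing one reader/writer
pair, not a new processor per conversion.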

#3) I think my answer in #2 will help, and I strongly encourage you to
take a look here [1] and here [2], and at [3] for a walkthrough of records
and schema registries. In short, yes, we offer a ton of flexibility in how
you handle record-oriented data. As is generally the case in NiFi, you
should get a great deal of reuse out of existing capabilities with minimal
need to customize.
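
As a taste of what the RecordPath guide [1] covers, fields within a record
are addressed with path expressions. A toy resolver in Python, purely
hypothetical and far simpler than NiFi's real RecordPath (which adds
predicates, wildcards, and functions):

```python
def record_path(record, path):
    """Resolve a simplified, RecordPath-like expression such as '/address/city'.

    Walks nested dicts one path segment at a time. Illustration only;
    see the RecordPath guide for the real syntax.
    """
    value = record
    for part in path.strip("/").split("/"):
        value = value[part]
    return value

rec = {"name": "clay", "address": {"city": "Chicago", "zip": "60601"}}
print(record_path(rec, "/address/city"))  # Chicago
```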

#4) I don't know how NiFi compares in performance to Apache Kafka's
Connect concept, or to any other project, in a generic sense. What we
know is what NiFi is designed for and the use cases it is used
against. NiFi and Kafka Connect have very different execution models.
With NiFi, for common record-oriented use cases, including format- and
schema-aware acquisition, routing, enrichment, conversion, and
delivery of data, achieving hundreds of thousands of records per second
of throughput is straightforward, while also running a number of other
flows on structured and unstructured data as well. It just depends on
your configuration, your needs, and what the appropriate execution model
is. NiFi offers you a data broker in which you put the logic for how
to handle otherwise decoupled producers and consumers. Driving data
into and out of Kafka with NiFi is very common.

[1] https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
[2] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
[3] http://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries

Thanks
Joe
