Posted to dev@nifi.apache.org by Matt Burgess <ma...@apache.org> on 2020/07/08 02:38:26 UTC

Re: Processor Extensibility

This is probably better suited for the dev list (not sure if you're
subscribed, but please do; BCC'ing users and moving to dev). The
implementations (components and their NARs) are not designed to be
subclassed for custom extensions outside the codebase. Can you
describe your use case (and custom processor)? If there's a common
reusable interface, we can talk about moving it to an API NAR and such,
but I believe the general guidance is to copy/paste if you
need code from the existing components in the codebase.

Regards,
Matt

On Tue, Jul 7, 2020 at 10:14 PM Eric Secules <es...@gmail.com> wrote:
>
> Hello,
>
> I was wondering if there was a recommendation on how to extend the functionality of NiFi processors without forking the NiFi repository. I'm looking for a way to include a processor's NAR in my project and extend from it. I'd also like to be able to extend that processor's test suite so I can leverage that. The "solution" I found (if you can call it that) was to copy the code from ValidateRecord.java into a new class and make the changes I wanted.
>
> Thanks,
> Eric

Re: Processor Extensibility

Posted by Eric Secules <es...@gmail.com>.

I have been implementing extra custom processors in my own NAR and then
just installing it alongside all the stock NARs. This has been okay for
processors which have no overlap with existing code. But most of the time I
am creating a variation of an existing processor, which entails copying the
processor's base implementation. Adding a Mongo processor for the
findAndUpdate operation is a good example.

I am wondering if I'd be better off forking the NiFi repo. Maybe that's the
lesser evil in terms of maintenance hell.

The way I have solved the issue with validation results is to inject a
"validationErrors" key into each invalid record. I don't know if there's a
philosophy against editing the record content, but this way you don't need
to correlate the record with the validation errors in an attribute and you
don't have to worry about truncation. To be honest I haven't yet considered
the behavior for CSV and other non-JSON records.
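
A minimal sketch of that idea, assuming a record can be modeled as a plain
Java Map. This is not the NiFi Record API; the class name, method name, and
sample data are hypothetical, and only the "validationErrors" key comes from
the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ValidationErrorInjector {

    // Key name taken from the description above.
    static final String ERRORS_KEY = "validationErrors";

    /**
     * Returns a copy of the record with the validation errors attached under
     * ERRORS_KEY. The input record is left untouched, and valid records
     * (empty error list) are copied without the extra key.
     */
    static Map<String, Object> withValidationErrors(Map<String, Object> record,
                                                    List<String> errors) {
        Map<String, Object> copy = new LinkedHashMap<>(record);
        if (!errors.isEmpty()) {
            copy.put(ERRORS_KEY, new ArrayList<>(errors));
        }
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical invalid record and error message, for illustration only.
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42);
        record.put("email", "not-an-email");

        Map<String, Object> annotated =
                withValidationErrors(record, List.of("email: not a valid address"));
        System.out.println(annotated);
    }
}
```

Because the errors travel inside the record itself, nothing has to be
correlated through a flowfile attribute. For flat formats such as CSV, the
list would presumably need to be flattened into a single field.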

-Eric


Re: Processor Extensibility

Posted by Chris Sampson <ch...@naimuri.com.INVALID>.

My solution to the exact same problem was to copy the processor code &
tests, then extend them locally as a new processor (of course this comes
with the overhead of having to maintain the code between releases of NiFi). I
was then (and still am) relatively new to NiFi, though.

We wanted the details of all validation issues raised for each record
within a file, but we also split the file before validation. This means lower
performance, but functionally we needed the audit trail per record, and this
also means the amount of information output from the validator shouldn't
exceed limits (e.g. attribute value size in the case where a file/record
contains lots of issues). We've since changed approach and now include
validation of our data inside a larger custom processor, but it effectively
still does the same thing.

There's a change due in 1.12 for the validator that looks to output the
validation details (up to a truncated limit) as a flowfile attribute, so
maybe you could look to backport that change for ease? You may still need
to look at whether the issue list is in a format you're able to use (i.e.
it's not a parsable JSON array).

Cheers,

Chris Sampson


Re: Processor Extensibility

Posted by Eric Secules <es...@gmail.com>.

In my use case I need to use the validation results later in the flow to
generate an error response. Currently the validation results are aggregated
and formed into a description on the provenance event. This doesn't help me
unfortunately.

-eric
