Posted to dev@nifi.apache.org by Phil H <gi...@gmail.com> on 2022/03/16 22:58:48 UTC
SplitContent doesn’t support regex?
Hi,
This seems like an odd omission - aside from performance (presumably?) is
there a reason why there isn’t a regex option for the byte sequence? I need
one but thought I’d ask before I built my own.
Thanks
Phil
Re: SplitContent doesn’t support regex?
Posted by Mark Payne <ma...@hotmail.com>.
Phil,
Yeah, that’s fine. We want to include the Jira number in the commit message, but you can include multiple by writing a message like:
NIFI-3470, NIFI-1517: Addressed Thing #1 and Thing #2
Thanks
-Mark
> On Apr 4, 2022, at 10:27 AM, Phil H <gi...@gmail.com> wrote:
>
> Whilst I try and get NiFi to build, let's circle back to JIRA. I found an
> open issue that matches my requirement (NIFI-1517), however to implement my
> solution, I'd also fix NIFI-3470 on the way (reading a configurable amount
> of data to run the regex over, rather than byte-by-byte).
>
> So, what's the proper way to go about this from a JIRA perspective? I
> assume my branch would be nifi-1517 as that's the feature I'm building, but
> it would also "solve" 3470?
>
> TIA,
> Phil
>
> On Thu, Mar 17, 2022 at 9:12 AM Joe Witt <jo...@gmail.com> wrote:
>
>> Phil
>>
>> I'd say if you have a good implementation in mind you should go for it.
>> Sounds interesting.
>>
>> Thanks
>>
>> On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> This seems like an odd omission - aside from performance (presumably?) is
>>> there a reason why there isn’t a regex option for the byte sequence? I need
>>> one but thought I’d ask before I built my own.
>>>
>>> Thanks
>>> Phil
>>>
>>
Re: SplitContent doesn’t support regex?
Posted by Phil H <gi...@gmail.com>.
Whilst I try and get NiFi to build, let's circle back to JIRA. I found an
open issue that matches my requirement (NIFI-1517), however to implement my
solution, I'd also fix NIFI-3470 on the way (reading a configurable amount
of data to run the regex over, rather than byte-by-byte).
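A rough sketch of what "reading a configurable amount of data to run the regex over" might look like, for illustration only. This is not the existing SplitContent code and not a proposed patch; the class name `RegexSplitSketch`, the `split` method and the `chunkSize` parameter are all made up for this example. It assumes the content can be decoded as ISO-8859-1 (one char per byte, so regex offsets line up with byte offsets) and it keeps unmatched data in memory without any upper bound, which a real processor would need to cap.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSplitSketch {

    /** Splits the stream on every match of delimiter, reading up to chunkSize (> 0) bytes at a time. */
    public static List<byte[]> split(InputStream in, Pattern delimiter, int chunkSize) throws IOException {
        List<byte[]> segments = new ArrayList<>();
        byte[] chunk = new byte[chunkSize];
        // ISO-8859-1 maps each byte to exactly one char, so char offsets equal byte offsets.
        StringBuilder pending = new StringBuilder();
        boolean eof = false;
        while (!eof) {
            int n = in.read(chunk);
            if (n < 0) {
                eof = true;
            } else {
                pending.append(new String(chunk, 0, n, StandardCharsets.ISO_8859_1));
            }
            Matcher m = delimiter.matcher(pending);
            int consumed = 0;
            while (m.find()) {
                // hitEnd() means the matcher ran into the end of the buffered data while
                // finding this match, so more input could lengthen it; wait unless at EOF.
                if (m.hitEnd() && !eof) {
                    break;
                }
                segments.add(pending.substring(consumed, m.start()).getBytes(StandardCharsets.ISO_8859_1));
                consumed = m.end();
            }
            // Drop everything up to the last accepted delimiter; keep the tail for the next pass.
            pending.delete(0, consumed);
        }
        if (pending.length() > 0) {
            segments.add(pending.toString().getBytes(StandardCharsets.ISO_8859_1));
        }
        return segments;
    }
}
```

The unmatched tail is rescanned after each read, so a delimiter that straddles two reads is still found. The cost is that a very long run with no delimiter gets rescanned repeatedly.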
So, what's the proper way to go about this from a JIRA perspective? I
assume my branch would be nifi-1517 as that's the feature I'm building, but
it would also "solve" 3470?
TIA,
Phil
On Thu, Mar 17, 2022 at 9:12 AM Joe Witt <jo...@gmail.com> wrote:
> Phil
>
> I'd say if you have a good implementation in mind you should go for it.
> Sounds interesting.
>
> Thanks
>
> On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
>
> > Hi,
> >
> > This seems like an odd omission - aside from performance (presumably?) is
> > there a reason why there isn’t a regex option for the byte sequence? I need
> > one but thought I’d ask before I built my own.
> >
> > Thanks
> > Phil
> >
>
Re: SplitContent doesn’t support regex?
Posted by Otto Fowler <ot...@gmail.com>.
Joe, I don’t know if we can make the case for a standalone processor for doing this on top of that; if so, I’d be willing to take a look.
From: Otto Fowler <ot...@gmail.com>
Reply: Otto Fowler <ot...@gmail.com>
Date: March 19, 2022 at 13:40:07
To: dev@nifi.apache.org <de...@nifi.apache.org>, Phil H <gi...@gmail.com>
Subject: Re: SplitContent doesn’t support regex?
In the Apache Metron Project (in the attic now) we used https://github.com/nishihatapalmer/byteseek to do pcap searches, maybe you can check that out.
From: Phil H <gi...@gmail.com>
Reply: dev@nifi.apache.org <de...@nifi.apache.org>
Date: March 16, 2022 at 20:04:58
To: dev@nifi.apache.org <de...@nifi.apache.org>
Subject: Re: SplitContent doesn’t support regex?
I dunno about a good implementation…
I did a similar extension of GetTCP to allow for a regex EOM rather than a
single byte. It works, but I don’t feel like it was done in the spirit of
the existing processor!
On Thu, 17 Mar 2022 at 09:12, Joe Witt <jo...@gmail.com> wrote:
> Phil
>
> I'd say if you have a good implementation in mind you should go for it.
> Sounds interesting.
>
> Thanks
>
> On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
>
> > Hi,
> >
> > This seems like an odd omission - aside from performance (presumably?) is
> > there a reason why there isn’t a regex option for the byte sequence? I need
> > one but thought I’d ask before I built my own.
> >
> > Thanks
> > Phil
> >
>
Re: SplitContent doesn’t support regex?
Posted by Otto Fowler <ot...@gmail.com>.
In the Apache Metron Project (in the attic now) we used
https://github.com/nishihatapalmer/byteseek to do pcap searches, maybe you
can check that out.
From: Phil H <gi...@gmail.com>
Reply: dev@nifi.apache.org <de...@nifi.apache.org>
Date: March 16, 2022 at 20:04:58
To: dev@nifi.apache.org <de...@nifi.apache.org>
Subject: Re: SplitContent doesn’t support regex?
I dunno about a good implementation…
I did a similar extension of GetTCP to allow for a regex EOM rather than a
single byte. It works, but I don’t feel like it was done in the spirit of
the existing processor!
On Thu, 17 Mar 2022 at 09:12, Joe Witt <jo...@gmail.com> wrote:
> Phil
>
> I'd say if you have a good implementation in mind you should go for it.
> Sounds interesting.
>
> Thanks
>
> On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
>
> > Hi,
> >
> > This seems like an odd omission - aside from performance (presumably?) is
> > there a reason why there isn’t a regex option for the byte sequence? I need
> > one but thought I’d ask before I built my own.
> >
> > Thanks
> > Phil
> >
>
Re: SplitContent doesn’t support regex?
Posted by Phil H <gi...@gmail.com>.
I dunno about a good implementation…
I did a similar extension of GetTCP to allow for a regex EOM rather than a
single byte. It works, but I don’t feel like it was done in the spirit of
the existing processor!
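For readers wondering what a regex end-of-message (EOM) marker could look like in practice, here is a minimal sketch. It is not the GetTCP extension mentioned above, whose code is not shown in this thread; `RegexEomFramer` and `feed` are hypothetical names for this example. It assumes received bytes can be treated as ISO-8859-1 and buffers incomplete data without a size limit.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEomFramer {
    private final Pattern eom;
    // Bytes received so far that do not yet end in an EOM match, held as ISO-8859-1 chars.
    private final StringBuilder buffer = new StringBuilder();

    public RegexEomFramer(Pattern eom) {
        this.eom = eom;
    }

    /** Feeds newly received bytes; returns each message they complete, EOM marker included. */
    public List<byte[]> feed(byte[] data, int length) {
        buffer.append(new String(data, 0, length, StandardCharsets.ISO_8859_1));
        List<byte[]> messages = new ArrayList<>();
        Matcher m = eom.matcher(buffer);
        int consumed = 0;
        while (m.find()) {
            // Everything from the previous marker up to the end of this one is one message.
            messages.add(buffer.substring(consumed, m.end()).getBytes(StandardCharsets.ISO_8859_1));
            consumed = m.end();
        }
        // Keep the incomplete remainder until more bytes arrive.
        buffer.delete(0, consumed);
        return messages;
    }
}
```

With a pattern such as `\r?\n`, a marker split across two reads is still found, because the unfinished tail stays in the buffer. A variable-length marker that happens to end exactly at a read boundary can be cut short, since this version accepts the first match it sees.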
On Thu, 17 Mar 2022 at 09:12, Joe Witt <jo...@gmail.com> wrote:
> Phil
>
> I'd say if you have a good implementation in mind you should go for it.
> Sounds interesting.
>
> Thanks
>
> On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
>
> > Hi,
> >
> > This seems like an odd omission - aside from performance (presumably?) is
> > there a reason why there isn’t a regex option for the byte sequence? I need
> > one but thought I’d ask before I built my own.
> >
> > Thanks
> > Phil
> >
>
Re: SplitContent doesn’t support regex?
Posted by Joe Witt <jo...@gmail.com>.
Phil
I'd say if you have a good implementation in mind you should go for it.
Sounds interesting.
Thanks
On Wed, Mar 16, 2022 at 3:59 PM Phil H <gi...@gmail.com> wrote:
> Hi,
>
> This seems like an odd omission - aside from performance (presumably?) is
> there a reason why there isn’t a regex option for the byte sequence? I need
> one but thought I’d ask before I built my own.
>
> Thanks
> Phil
>