Posted to dev@nifi.apache.org by Yolanda Davis <yo...@gmail.com> on 2019/07/30 12:55:40 UTC

[DISCUSS] Predictive Analytics for NiFi Metrics

Hello Everyone,

I wanted to reach out to the community to discuss potentially enhancing
NiFi to include predictive analytics that can help users assess and predict
NiFi behavior and performance. Currently NiFi has many metrics available,
covering areas such as JVM and flow component usage (via component status),
as well as provenance data, which NiFi makes available either through the
UI or through reporting tasks (for consumption by other systems). Past
discussions in the community cite users shipping this data to applications
such as Prometheus, ELK stacks, or Ambari Metrics for further analysis in
order to capture/review performance issues, detect anomalies, and send
alerts or notifications.  These systems are efficient at capturing and
helping to analyze these metrics; however, providing meaningful analytics
within a flow context requires customization work and knowledge of NiFi
operations.

In speaking with Matt Burgess and Andy Christianson on this topic we feel
that there is an opportunity to introduce an analytics framework that could
provide users reasonable predictions on key performance indicators for
flows, such as back pressure and flow rate, to help administrators improve
operational management of NiFi clusters.  This framework could offer
several key features:

   - Provide a flexible internal analytics engine and model API which
   supports the addition of or enhancement to onboard models
   - Support integration of remote or cloud-based ML models
   - Support both traditional and online (incremental) learning methods
   - Provide support for model caching (perhaps later inclusion into a
   model repository or registry)
   - Provide UI enhancements to display prediction information either in
   existing summary data, new data visualizations, or directly within the
   flow/canvas (where applicable)
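
To make the difference between traditional and online (incremental)
learning in the list above more concrete, here is a minimal sketch. It is
only an illustration under stated assumptions: the Model base class, both
implementations, and the choice of a mean and an exponentially weighted
moving average are hypothetical and are not part of any existing NiFi API.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical model API for illustration; not NiFi's interface."""

    @abstractmethod
    def learn(self, observations):
        """Update the model from a list of numeric observations."""

    @abstractmethod
    def predict(self):
        """Return the predicted next value, or None if nothing was learned."""


class BatchMeanModel(Model):
    """Traditional style: every call retrains from the full batch it is given."""

    def __init__(self):
        self._mean = None

    def learn(self, observations):
        # Earlier state is discarded; the batch is the whole training set.
        if observations:
            self._mean = sum(observations) / len(observations)
        else:
            self._mean = None

    def predict(self):
        return self._mean


class OnlineEwmaModel(Model):
    """Online style: each observation is folded into a running estimate."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # weight given to the newest observation
        self._estimate = None

    def learn(self, observations):
        for value in observations:
            if self._estimate is None:
                self._estimate = float(value)
            else:
                self._estimate = (self.alpha * value
                                  + (1 - self.alpha) * self._estimate)

    def predict(self):
        return self._estimate
```

The batch model needs all observations at once, while the online model can
be fed one sample at a time and keeps only a single number of state, which
is the property that would matter for metrics arriving continuously.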

For an initial target we thought that back pressure prediction would be a
good starting point for this initiative, given that back pressure detection
is a key indicator of flow performance and many of the metrics currently
available would provide enough data points to create a reasonably
well-performing model.  We have some ideas on how this could be achieved;
however, we wanted to discuss this more with the community to get thoughts
about tackling this work, especially if there are specific use cases or
other factors that should be considered.
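
As one possible illustration of back pressure prediction from existing
metrics, the sketch below fits a straight line to recent queue-size samples
and extrapolates to a threshold. This is an assumption made for the sake of
example, not the approach decided on in this thread; the function name, the
(timestamp, queued count) sample format, and the threshold value are all
hypothetical.

```python
def estimate_seconds_to_threshold(samples, threshold):
    """Estimate seconds until a queue reaches its back pressure threshold.

    samples: list of (timestamp_seconds, queued_count), oldest first.
    Returns 0.0 if the latest sample is already at or above the threshold,
    None if there is too little data or the queue is not growing, and
    otherwise the time from the last sample until the fitted line crosses
    the threshold.
    """
    if len(samples) < 2:
        return None
    last_time, last_count = samples[-1]
    if last_count >= threshold:
        return 0.0

    # Ordinary least-squares fit of queued_count = slope * time + intercept.
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp; no trend to fit
    slope = sum((t - mean_t) * (c - mean_c) for t, c in samples) / var_t
    if slope <= 0:
        return None  # queue is flat or draining; no crossing predicted
    intercept = mean_c - slope * mean_t

    crossing_time = (threshold - intercept) / slope
    return max(0.0, crossing_time - last_time)
```

For example, three samples taken a minute apart that grow from 100 to 300
queued flowfiles, with an illustrative threshold of 10,000, give an estimate
of 5,820 seconds (about 97 minutes) from the last sample.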

Looking forward to everyone's thoughts and input.

Thanks,

-yolanda

--
yolanda.m.davis@gmail.com
@YolandaMDavis

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Yolanda Davis <yo...@gmail.com>.
Hi Craig,

Thanks for your feedback and insight on your use cases.  What version of
MiNiFi are you running?  As for performing edge ML, this may be possible
for you with MiNiFi C++ version 0.6.0.  That release supports the creation
of Python processors which can be added to your flow to execute models.
Andy Christianson sent me info from a blog written by Marc Parisi on this
topic here: https://www.parisi.io/index.php/2019/03/27/hey-bro/

In creating an analytics framework for models we may look to simplify
things further: instead of creating a processor for each model, you could
perhaps just implement a simple interface and rely on the engine to execute
things as needed.  But for now perhaps the Python processor could help
fill the gap for you?
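
To give a rough idea of what "implement a simple interface and rely on the
engine" could look like, here is a small sketch. Everything in it is
hypothetical (the AnalyticsEngine class, the callable-based model interface,
and the metric names); it is not an existing NiFi or MiNiFi API, only an
illustration of the division of responsibilities.

```python
from typing import Callable, Dict

# Hypothetical "simple interface": a model is any callable that takes a
# dict of metric values and returns one numeric prediction.
Metrics = Dict[str, float]
ModelFn = Callable[[Metrics], float]


class AnalyticsEngine:
    """Illustrative engine that owns model execution (not a real NiFi API)."""

    def __init__(self) -> None:
        self._models: Dict[str, ModelFn] = {}

    def register(self, name: str, model: ModelFn) -> None:
        # Model authors only hand over a function; no processor is written.
        self._models[name] = model

    def run(self, metrics: Metrics) -> Dict[str, float]:
        # The engine, not the model author, decides when models execute.
        return {name: model(metrics) for name, model in self._models.items()}


def queue_utilization(metrics: Metrics) -> float:
    """Example model: fraction of the back pressure threshold in use."""
    return metrics["queued_count"] / metrics["threshold"]


engine = AnalyticsEngine()
engine.register("queue_utilization", queue_utilization)
```

Calling engine.run({"queued_count": 2500.0, "threshold": 10000.0}) would
return {"queue_utilization": 0.25}; the model author writes only the
function, and scheduling stays with the engine.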

-yolanda


On Wed, Jul 31, 2019 at 6:01 AM Craig Knell <cr...@gmail.com> wrote:

> Sounds great.
>
> Let me know if you need some help
>
> Best regards
>
> Craig
>
>
>
> > On 31 Jul 2019, at 17:31, Arpad Boda <ab...@cloudera.com.invalid> wrote:
> >
> > Craig,
> >
> > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and Modbus (
> > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the way for
> > MiNiFi C++; hopefully both will be part of the next release (0.7.0).
> > It's gonna be legen... wait for it! :)
> >
> > Regards,
> > Arpad
> >
> >> On Wed, Jul 31, 2019 at 2:30 AM Craig Knell <cr...@gmail.com> wrote:
> >>
> >> Hi Folks
> >>
> >> That's our use case now.  All our models are run in Python.
> >> Currently we send events to the ML via HTTP, although this is not
> >> optimal.
> >>
> >> Our use case is edge ML, where we want a lightweight wrapper for a
> >> Python code base.
> >> Jython, however, does not work with the code base.
> >> I'm thinking of changing the interface to something like Redis for
> >> pub/sub.
> >> I'd also like this to be a push deployment via MiNiFi.
> >>
> >> Also, support for sensors via protocols such as Modbus and OPC would
> >> be great.
> >>
> >> Craig
> >>
> >>> On Wed, Jul 31, 2019 at 1:43 AM Joe Witt <jo...@gmail.com> wrote:
> >>>
> >>> Definitely something that I think would really help the community.  It
> >>> might make sense to frame/structure these APIs such that an internal
> >>> option could be available to reduce dependencies and get up and running,
> >>> but also such that a remote implementation, where the engine lives and
> >>> is managed externally, could just as easily be supported.
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto <al...@apache.org> wrote:
> >>>
> >>>> Yolanda,
> >>>>
> >>>> I think this sounds like a great idea and will be very useful to
> >>>> admins/users, as well as enabling some interesting next-level
> >>>> functionality and insight generation. Thanks for putting this out there.
> >>>>
> >>>> Andy LoPresto
> >>>> alopresto@apache.org
> >>>> alopresto.apache@gmail.com
> >>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >> --
> >> Regards
> >>
> >> Craig Knell
> >> Mobile: +61 402 128 615
> >> Skype: craigknell
> >>
>


-- 
--
yolanda.m.davis@gmail.com
@YolandaMDavis

Re: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Andy Christianson <ai...@protonmail.com.INVALID>.
Hi Rob,

Thanks for the UI PR. Taking a look.

-Andy

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, August 20, 2019 10:04 AM, Robert Fellows <ro...@gmail.com> wrote:

> Mark and Yolanda,
> I submitted the PR I mentioned yesterday for the UI changes that surface
> the exposed prediction data. Let me know what you think.
>
> https://github.com/apache/nifi/pull/3660
>
> Thanks,
> Rob
>
> > > > On Wed, Jul 31, 2019 at 9:58 AM Andy Christianson <aichrist@protonmail.com.invalid> wrote:
> > > >
> > > > > As someone who operated a 24/7 mission-critical NiFi flow, this
> > > > > feature would have been a life saver. If I'm heading home on a
> > > > > Friday, it would be great to have some blinking red lights to let me
> > > > > know that the system predicts that it is going to experience
> > > > > backpressure sometime over the weekend, so that corrective action
> > > > > could be taken before leaving.
> > > > >
> > > > > Since there is support in the community for this, I created a JIRA to
> > > > > track the effort:
> > > > >
> > > > > https://issues.apache.org/jira/browse/NIFI-6510
> > > > >
> > > > > I also created a JIRA to track the remote protocol:
> > > > >
> > > > > https://issues.apache.org/jira/browse/NIFI-6511
> > > > >
> > > > > Regards,
> > > > >
> > > > > Andy
> > > > >
> > > > > Sent from ProtonMail, Swiss-based encrypted email.
> > > > >
> > > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > > > On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <aboda@apache.org> wrote:
> > > > >
> > > > > > If you could share a bit more details about your OPC and Modbus
> > > > > > usage, that would be highly appreciated!
>
> --
>
> Rob Fellows



Re: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Robert Fellows <ro...@gmail.com>.
Mark and Yolanda,
  I submitted the PR I mentioned yesterday for the UI changes that surface
the exposed prediction data. Let me know what you think.

https://github.com/apache/nifi/pull/3660

Thanks,
Rob


On Mon, Aug 19, 2019 at 4:17 PM Yolanda Davis <yo...@gmail.com>
wrote:

> Hi Mark and Rob
>
> Mark thanks so much for the info on your work and Rob thanks for jumping in
> on the UI! I just wanted to add, Mark, that looking at your branch I think
> we also may have some opportunities to exchange notes or collaborate on the
> backend as well.  The work in the feature branch is still in progress (with
> some decoupling to ensure we can allow flexible configuration of models).
> Please feel free to review and leave comments under the parent JIRA.  At
> the same time I'll take a deeper dive on your branch and perhaps we can
> exchange notes on potential areas for improvement/collaboration if it makes
> sense?
>
> Thanks Again,
>
> -yolanda
>
>
> On Mon, Aug 19, 2019 at 3:34 PM Robert Fellows <ro...@gmail.com>
> wrote:
>
> > Hey Mark,
> >   I've started working on some UI based on the initial commit for this
> > proposal. What you have done and what I am working on have a bit of
> > overlap, but not much.
> > I'm working on getting the predicted count and bytes into the existing
> > connection metric display that is already on the canvas. The only overlap
> > looks like it might be in the Summary table. I plan on adding a PR for my
> > additions, hopefully tomorrow. Maybe once it is up we can discuss how we
> > bring them together where it makes sense?
> >
> > This is the main JIRA case:
> > https://issues.apache.org/jira/browse/NIFI-6510
> > And this is the subtask that I am working toward:
> > https://issues.apache.org/jira/browse/NIFI-6568
> >
> >
> > -- Rob Fellows
> >
> > On Mon, Aug 19, 2019 at 2:26 PM Owens, Mark <jm...@evoforge.org> wrote:
> >
> > > The images from the previous email do not appear to be displaying. They
> > > can be viewed at:
> > > https://github.com/jmark99/nifi-images
> > >
> > > From: Owens, Mark <jm...@evoforge.org>
> > > Sent: Monday, August 19, 2019 2:25 PM
> > > To: dev@nifi.apache.org
> > > Subject: RE: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics
> > >
> > > Hi Yolanda,
> > >
> > > I've been working on a feature that appears to overlap with the work
> > > you are pursuing. Perhaps we should see whether we can coordinate our
> > > efforts. I've been updating NiFi to predict the time to queue overflow
> > > for both flowfiles and bytes and to display that information in the
> > > GUI. For the initial attempt, I've been using a simple model of
> > > straight-line prediction over a sliding window of 15 minutes to predict
> > > when flows will fail. This estimate is then displayed both on the NiFi
> > > Summary page under the Connections tab and in the status history
> > > graphs.  Below are examples of what would be displayed to the user.
> > >
> > > [cid:image001.png@01D55696.E4CCD550]
> > >
> > > The Connections tab contains a new column on the right that displays
> > > the prediction for both flowfile count and data size. The user can
> > > select a maximum time beyond which specific times are no longer
> > > displayed. In this example, if the prediction lies beyond 12 hours then
> > > the display simply indicates that the flow is currently more than 12
> > > hours away from failure.
> > >
> > > [cid:image002.png@01D55697.2C8AC500]
> > >
> > > This display graphs the prediction for byte overflow over time. Note
> > > that if the estimate is greater than the user-provided maximum value of
> > > interest, the graph maxes out at that time, effectively indicating no
> > > overflow concerns.
> > >
> > > [cid:image003.png@01D55697.965C27D0]
> > >
> > > A similar graph is displayed for flowfile count.
> > >
> > > The current state of work can be found at
> > > https://github.com/jmark99/nifi/tree/time-to-overflow
> > >
> > > I welcome your (or anyone else's) feedback on this effort.
> > >
> > > Thanks,
> > > Mark
> > >
> > > P.S. If the images are not displaying, they can be viewed at
> > > https://github.com/jmark99/nifi-images
> > >
> > > -----Original Message-----
> > > From: Yolanda Davis <yolanda.m.davis@gmail.com<mailto:
> > > yolanda.m.davis@gmail.com>>
> > > Sent: Monday, August 19, 2019 11:29 AM
> > > To: dev@nifi.apache.org<ma...@nifi.apache.org>
> > > Subject: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics
> > >
> > >
> > >
> > > Hello All,
> > >
> > >
> > >
> > > I just wanted to follow up on the discussion we started a couple of
> weeks
> > > ago concerning an analytics framework for NiFi metrics.  Working with
> > Andy
> > > Christianson and Matt Burgess we shaped our ideas and drafted a
> proposal
> > > for this feature on the Apache NiFi Wiki [1] . We've also begun
> > > implementing some of these ideas in a feature branch (which is work in
> > >
> > > progress) [2].  We’d appreciate any questions or feedback you may have.
> > >
> > >
> > >
> > > Thanks,
> > >
> > >
> > >
> > > -yolanda
> > >
> > >
> > >
> > > [1] -
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Operational+Analytics+Framework+for+NiFi
> > >
> > > [2] - https://github.com/apache/nifi/commits/analytics-framework
> > >
> > >
> > >
> > > On Wed, Jul 31, 2019 at 9:58 AM Andy Christianson <
> > aichrist@protonmail.com
> > > .invalid<ma...@protonmail.com.invalid>> wrote:
> > >
> > >
> > >
> > > > As someone who operated a 24/7 mission-critical NiFi flow, this
> > >
> > > > feature would have been a life saver. If I'm heading home on a
> Friday,
> > >
> > > > it would be great to have some blinking red lights to let me know
> that
> > >
> > > > the system predicts that it is going to experience backpressure
> > >
> > > > sometime over the weekend, so that corrective action could be taken
> > > before leaving.
> > >
> > > >
> > >
> > > > Since there is support in the community for this, I created a JIRA to
> > >
> > > > track the effort:
> > >
> > > >
> > >
> > > > https://issues.apache.org/jira/browse/NIFI-6510
> > >
> > > >
> > >
> > > > I also created a JIRA to track the remote protocol:
> > >
> > > >
> > >
> > > > https://issues.apache.org/jira/browse/NIFI-6511
> > >
> > > >
> > >
> > > >
> > >
> > > > Regards,
> > >
> > > >
> > >
> > > > Andy
> > >
> > > >
> > >
> > > >
> > >
> > > > Sent from ProtonMail, Swiss-based encrypted email.
> > >
> > > >
> > >
> > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > >
> > > > On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <aboda@apache.org
> > > <ma...@apache.org>> wrote:
> > >
> > > >
> > >
> > > > > If you could share a bit more details about your OPC and Modbus
> > >
> > > > > usage,
> > >
> > > > that
> > >
> > > > > would be highly appreciated!
> > >
> > > > >
> > >
> > > > > On Wed, Jul 31, 2019 at 12:01 PM Craig Knell craig.knell@gmail.com
> > > <ma...@gmail.com>
> > >
> > > > wrote:
> > >
> > > > >
> > >
> > > > > > Sounds. Great
> > >
> > > > > > Let me know if you need some help
> > >
> > > > > > Best regards
> > >
> > > > > > Craig
> > >
> > > > > >
> > >
> > > > > > > On 31 Jul 2019, at 17:31, Arpad Boda aboda@cloudera.com.invalid
> > > <ma...@cloudera.com.invalid>
> > >
> > > > wrote:
> > >
> > > > > > > Craig,
> > >
> > > > > > > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 )
> and
> > >
> > > > Modbus (
> > >
> > > > > > > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on
> the
> > >
> > > > way for
> > >
> > > > > > > MiNiFi c++, hopefully both will be part of next release
> (0.7.0).
> > >
> > > > > > > It's gonna be legen... wait for it! :) Regards, Arpad
> > >
> > > > > > >
> > >
> > > > > > > > On Wed, Jul 31, 2019 at 2:30 AM Craig Knell
> > >
> > > > > > > > craig.knell@gmail.com<ma...@gmail.com>
> > >
> > > > > > > > wrote:
> > >
> > > > > > >
> > >
> > > > > > > > Hi Folks
> > >
> > > > > > > > That's our use case now. All our Models are run in python.
> > >
> > > > > > > > Currently we send events to the ML via http, although this is
> > >
> > > > > > > > not optimal
> > >
> > > > > > >
> > >
> > > > > > > > Our use case is edge ML where we want a light weight wrapper
> > >
> > > > > > > > for Python code base.
> > >
> > > > > > > > Jython however does not work with the code base I'm think of
> > >
> > > > > > > > changing the interface to some thing like REDIS for
> > >
> > > > pub/sub
> > >
> > > > > > > > Id also like this to be a push deployment via minifi Also
> > >
> > > > > > > > support for sensors via protocols via Modbus and OPC would be
> > >
> > > > great
> > >
> > > > > > > > Craig
> > >
> > > > > > > >
> > >
> > > > > > > > > On Wed, Jul 31, 2019 at 1:43 AM Joe Witt
> joe.witt@gmail.com
> > > <ma...@gmail.com>
> > >
> > > > wrote:
> > >
> > > > > > > > > Definitely something that I think would really help the
> > >
> > > > community. It
> > >
> > > > > > > > > might make sense to frame/structure these APIs such that an
> > >
> > > > internal
> > >
> > > > > > > > > option
> > >
> > > > > > > > > could be available to reduce dependencies and get up and
> > >
> > > > > > > > > running
> > >
> > > > but
> > >
> > > > > > > > > that
> > >
> > > > > > >
> > >
> > > > > > > > > also just as easily a remote implementation where the
> engine
> > >
> > > > lives and
> > >
> > > > > > > > > is
> > >
> > > > > > >
> > >
> > > > > > > > > managed externally could also be supported.
> > >
> > > > > > > > > Thanks
> > >
> > > > > > > > > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto
> > >
> > > > alopresto@apache.org<ma...@apache.org>
> > >
> > > > > > > > > wrote:
> > >
> > > > > > > > >
> > >
> > > > > > > > > > Yolanda,
> > >
> > > > > > > > > > I think this sounds like a great idea and will be very
> > >
> > > > > > > > > > useful
> > >
> > > > to
> > >
> > > > > > > > > > admins/users, as well as enabling some interesting
> > >
> > > > > > > > > > next-level functionality
> > >
> > > > > > > > >
> > >
> > > > > > > > > > and insight generation. Thanks for putting this out
> there.
> > >
> > > > > > > > > > Andy LoPresto
> > >
> > > > > > > > > > alopresto@apache.org<ma...@apache.org>
> > >
> > > > > > > > > > alopresto.apache@gmail.com<mailto:
> > alopresto.apache@gmail.com>
> > > PGP Fingerprint: 70EC B3E5 98A6
> > >
> > > > > > > > > > 5A3F D3C4 BACE 3C6E F65B 2F7D
> > >
> > > > EF69
> > >
> > > > > > > > > >
> > >
> > > > > > > > > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> > >
> > > > > > > > > > > yolanda.m.davis@gmail.com<mailto:
> > yolanda.m.davis@gmail.com
> > > >>
> > >
> > > > > > > > >
> > >
> > > > > > > > > > wrote:
> > >
> > > > > > > > > >
> > >
> > > > > > > > > > > Hello Everyone,
> > >
> > > > > > > > > > > I wanted to reach out to the community to discuss
> > >
> > > > > > > > > > > potentially enhancing
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > NiFi to include predictive analytics that can help
> users
> > >
> > > > assess and
> > >
> > > > > > > > > > > predict
> > >
> > > > > > > > > > > NiFi behavior and performance. Currently NiFi has lots
> > >
> > > > > > > > > > > of
> > >
> > > > metrics
> > >
> > > > > > > > > > > available
> > >
> > > > > > > > > > > for areas including jvm and flow component usage (via
> > >
> > > > component
> > >
> > > > > > > > > > > status)
> > >
> > > > > > > > >
> > >
> > > > > > > > > > as
> > >
> > > > > > > > > >
> > >
> > > > > > > > > > > well as provenance data which NiFi makes available
> > >
> > > > > > > > > > > either
> > >
> > > > through
> > >
> > > > > > > > > > > the UI
> > >
> > > > > > > > >
> > >
> > > > > > > > > > or
> > >
> > > > > > > > > >
> > >
> > > > > > > > > > > reporting tasks (for consumption by other systems).
> Past
> > >
> > > > discussions
> > >
> > > > > > > > > > > in
> > >
> > > > > > > > >
> > >
> > > > > > > > > > the
> > >
> > > > > > > > > >
> > >
> > > > > > > > > > > community cite users shipping this data to applications
> > >
> > > > > > > > > > > such
> > >
> > > > as
> > >
> > > > > > > > > > > Prometheus,
> > >
> > > > > > > > > > > ELK stacks, or Ambari metrics for further analysis in
> > >
> > > > > > > > > > > order
> > >
> > > > to
> > >
> > > > > > > > > > > capture/review performance issues, detect anomalies,
> and
> > >
> > > > send alerts
> > >
> > > > > > > > > > > or
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > notifications. These systems are efficient in capturing
> > >
> > > > > > > > > > > and
> > >
> > > > helping
> > >
> > > > > > > > > > > to
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > analyze these metrics however it requires customization
> > >
> > > > > > > > > > > work
> > >
> > > > and
> > >
> > > > > > > > > > > knowledge
> > >
> > > > > > > > > > > of NiFi operations to provide meaningful analytics
> > >
> > > > > > > > > > > within a
> > >
> > > > flow
> > >
> > > > > > > > > > > context.
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > In speaking with Matt Burgess and Andy Christianson on
> > >
> > > > > > > > > > > this
> > >
> > > > topic we
> > >
> > > > > > > > > > > feel
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > that there is an opportunity to introduce an analytics
> > >
> > > > framework that
> > >
> > > > > > > > > > > could
> > >
> > > > > > > > > > > provide users reasonable predictions on key performance
> > >
> > > > indicators
> > >
> > > > > > > > > > > for
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > flows, such as back pressure and flow rate, to help
> > >
> > > > administrators
> > >
> > > > > > > > > > > improve
> > >
> > > > > > > > > > > operational management of NiFi clusters. This framework
> > >
> > > > could offer
> > >
> > > > > > > > > > > several key features:
> > >
> > > > > > > > > > >
> > >
> > > > > > > > > > > -   Provide a flexible internal analytics engine and
> > model
> > >
> > > > api which
> > >
> > > > > > > > > > >     supports the addition of or enhancement to onboard
> > >
> > > > > > > > > > > models
> > >
> > > > > > > > > > >
> > >
> > > > > > > > > > > -   Support integration of remote or cloud based ML
> > models
> > >
> > > > > > > > > > > -   Support both traditional and online (incremental)
> > >
> > > > learning
> > >
> > > > > > > > > > >     methods
> > >
> > > > > > > > > > >
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > -   Provide support for model caching (perhaps later
> > >
> > > > inclusion into
> > >
> > > > > > > > > > >     a
> > >
> > > > > > > > > > >
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > model repository or registry)
> > >
> > > > > > > > > > >
> > >
> > > > > > > > > > > -   UI enhancements to display prediction information
> > > either
> > >
> > > > in
> > >
> > > > > > > > > > >     existing
> > >
> > > > > > > > > > >
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > summary data, new data visualizations, or directly
> > >
> > > > > > > > > > > within the flow/canvas (where applicable) For an
> initial
> > >
> > > > > > > > > > > target we thought that back pressure
> > >
> > > > prediction would
> > >
> > > > > > > > > > > be a
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > good starting point for this initiative, given that
> back
> > >
> > > > pressure
> > >
> > > > > > > > > > > detection
> > >
> > > > > > > > > > > is a key indicator of flow performance and many of the
> > >
> > > > metrics
> > >
> > > > > > > > > > > currently
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > available would provide enough data points to create a
> > >
> > > > reasonable
> > >
> > > > > > > > > > > performing model. We have some ideas on how this could
> > >
> > > > > > > > > > > be
> > >
> > > > achieved
> > >
> > > > > > > > > > > however
> > >
> > > > > > > > > > > we wanted to discuss this more with the community to
> get
> > >
> > > > thoughts
> > >
> > > > > > > > > > > about
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > tackling this work, especially if there are specific
> use
> > >
> > > > cases or
> > >
> > > > > > > > > > > other
> > >
> > > > > > > > >
> > >
> > > > > > > > > > > factors that should be considered.
> > >
> > > > > > > > > > > Looking forward to everyone's thoughts and input.
> > >
> > > > > > > > > > > Thanks,
> > >
> > > > > > > > > > > -yolanda
> > >
> > > > > > > > > > > --
> > >
> > > > > > > > > > > yolanda.m.davis@gmail.com<mailto:
> > yolanda.m.davis@gmail.com>
> > > @YolandaMDavis
> > >
> > > > > > > >
> > >
> > > > > > > > --
> > >
> > > > > > > > Regards
> > >
> > > > > > > > Craig Knell
> > >
> > > > > > > > Mobile: +61 402 128 615
> > >
> > > > > > > > Skype: craigknell
> > >
> > > >
> > >
> > > >
> > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > --
> > >
> > > yolanda.m.davis@gmail.com<ma...@gmail.com>
> > >
> > > @YolandaMDavis
> > >
> >
> >
> > --
> > -------------------------------
> > Rob Fellows
> >
>
>
> --
> --
> yolanda.m.davis@gmail.com
> @YolandaMDavis
>


-- 
-------------------------------
Rob Fellows

Re: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Yolanda Davis <yo...@gmail.com>.
Hi Mark and Rob

Mark, thanks so much for the info on your work, and Rob, thanks for jumping in
on the UI! I just wanted to add, Mark, that after looking at your branch I
think we may also have some opportunities to collaborate on the backend.  The
work in the feature branch is still in progress (with some decoupling to
ensure we can allow flexible configuration of models).  Please feel free to
review and leave comments under the parent JIRA.  At the same time I'll take a
deeper dive into your branch, and perhaps we can exchange notes on potential
areas for improvement/collaboration, if that makes sense?

Thanks Again,

-yolanda


On Mon, Aug 19, 2019 at 3:34 PM Robert Fellows <ro...@gmail.com>
wrote:

> Hey Mark,
>   I've started working on some UI based on the initial commit for this
> proposal. What you have done and what I am working on have a bit of
> overlap, but not much.
> I'm working on getting the predicted count and bytes into the existing
> connection metric display that is already on the canvas. The only overlap
> looks like it might be in the
> Summary table. I plan on adding a PR for my additions hopefully tomorrow.
> Maybe once it is up we can discuss how we bring them together where it
> makes sense?
>
> This is the main JIRA case:
> https://issues.apache.org/jira/browse/NIFI-6510
> And this is the subtask that I am working toward:
> https://issues.apache.org/jira/browse/NIFI-6568
>
>
> -- Rob Fellows
>
> On Mon, Aug 19, 2019 at 2:26 PM Owens, Mark <jm...@evoforge.org> wrote:
>
> > The images from the previous email do not appear to be displaying. They
> > can be viewed at:
> > https://github.com/jmark99/nifi-images
> >
> > From: Owens, Mark <jm...@evoforge.org>
> > Sent: Monday, August 19, 2019 2:25 PM
> > To: dev@nifi.apache.org
> > Subject: RE: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics
> >
> >
> > Hi Yolanda,
> >
> >
> >
> > I've been working on a feature that appears to possibly overlap with the
> > work you are pursuing. Perhaps we should see if/should we try to
> coordinate
> > our efforts. I've been updating NiFi to predict the time to queue
> overflow
> > for both flowfiles and bytes and displaying that information in the GUI.
> > For the initial attempt, I’ve been using a simple model of straight line
> > prediction over a sliding window of 15 minutes to predict when flows will
> > fail. This estimate is then displayed on both the NiFi Summary page under
> > the connections tab and in the status history graphs.  Below are examples
> > of what would be displayed to the user.
> >
> >
> >
> > [cid:image001.png@01D55696.E4CCD550]
> >
> >
> >
> > The Connection tab contains a new column on the right that displays the
> > prediction for both flow files and data size. The user can select a
> maximum
> > time at which specific times are no longer displayed. In this example, if
> > the prediction lies beyond 12 hours then the display simply indicates
> that
> > the flow is greater than 12 hours away from failure at the moment.
> >
> >
> >
> > [cid:image002.png@01D55697.2C8AC500]
> >
> >
> >
> > This display graphs the prediction for byte overflow over time. Note that
> > if the estimate is greater than the user provided maximum value of
> interest
> > the graph maxes out at that time, effectively indicating no overflow
> > concerns.
> >
> >
> >
> > [cid:image003.png@01D55697.965C27D0]
> >
> >
> >
> > A similar display for flowfile count is displayed as well.
> >
> >
> >
> > The current state of work can be found at
> > https://github.com/jmark99/nifi/tree/time-to-overflow
> >
> >
> >
> > I welcome your (or any others) feedback on this effort.
> >
> >
> >
> > Thanks,
> > Mark
> >
> >
> >
> > P.S. If the images are not displaying, they can be viewed at
> > https://github.com/jmark99/nifi-images
> >
> >
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: Yolanda Davis <yolanda.m.davis@gmail.com<mailto:
> > yolanda.m.davis@gmail.com>>
> > Sent: Monday, August 19, 2019 11:29 AM
> > To: dev@nifi.apache.org<ma...@nifi.apache.org>
> > Subject: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics
> >
> >
> >
> > Hello All,
> >
> >
> >
> > I just wanted to follow up on the discussion we started a couple of weeks
> > ago concerning an analytics framework for NiFi metrics.  Working with
> Andy
> > Christianson and Matt Burgess we shaped our ideas and drafted a proposal
> > for this feature on the Apache NiFi Wiki [1] . We've also begun
> > implementing some of these ideas in a feature branch (which is work in
> >
> > progress) [2].  We’d appreciate any questions or feedback you may have.
> >
> >
> >
> > Thanks,
> >
> >
> >
> > -yolanda
> >
> >
> >
> > [1] -
> >
> >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Operational+Analytics+Framework+for+NiFi
> >
> > [2] - https://github.com/apache/nifi/commits/analytics-framework
> >
> >
> >
> > On Wed, Jul 31, 2019 at 9:58 AM Andy Christianson <
> aichrist@protonmail.com
> > .invalid<ma...@protonmail.com.invalid>> wrote:
> >
> >
> >
> > > As someone who operated a 24/7 mission-critical NiFi flow, this
> >
> > > feature would have been a life saver. If I'm heading home on a Friday,
> >
> > > it would be great to have some blinking red lights to let me know that
> >
> > > the system predicts that it is going to experience backpressure
> >
> > > sometime over the weekend, so that corrective action could be taken
> > before leaving.
> >
> > >
> >
> > > Since there is support in the community for this, I created a JIRA to
> >
> > > track the effort:
> >
> > >
> >
> > > https://issues.apache.org/jira/browse/NIFI-6510
> >
> > >
> >
> > > I also created a JIRA to track the remote protocol:
> >
> > >
> >
> > > https://issues.apache.org/jira/browse/NIFI-6511
> >
> > >
> >
> > >
> >
> > > Regards,
> >
> > >
> >
> > > Andy
> >
> > >
> >
> > >
> >
> > > Sent from ProtonMail, Swiss-based encrypted email.
> >
> > >
> >
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> >
> > > On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <aboda@apache.org
> > <ma...@apache.org>> wrote:
> >
> > >
> >
> > > > If you could share a bit more details about your OPC and Modbus
> >
> > > > usage,
> >
> > > that
> >
> > > > would be highly appreciated!
> >
> > > >
> >
> > > > On Wed, Jul 31, 2019 at 12:01 PM Craig Knell craig.knell@gmail.com
> > <ma...@gmail.com>
> >
> > > wrote:
> >
> > > >
> >
> > > > > Sounds. Great
> >
> > > > > Let me know if you need some help
> >
> > > > > Best regards
> >
> > > > > Craig
> >
> > > > >
> >
> > > > > > On 31 Jul 2019, at 17:31, Arpad Boda aboda@cloudera.com.invalid
> > <ma...@cloudera.com.invalid>
> >
> > > wrote:
> >
> > > > > > Craig,
> >
> > > > > > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and
> >
> > > Modbus (
> >
> > > > > > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the
> >
> > > way for
> >
> > > > > > MiNiFi c++, hopefully both will be part of next release (0.7.0).
> >
> > > > > > It's gonna be legen... wait for it! :) Regards, Arpad
> >
> > > > > >
> >
> > > > > > > On Wed, Jul 31, 2019 at 2:30 AM Craig Knell
> >
> > > > > > > craig.knell@gmail.com<ma...@gmail.com>
> >
> > > > > > > wrote:
> >
> > > > > >
> >
> > > > > > > Hi Folks
> >
> > > > > > > That's our use case now. All our Models are run in python.
> >
> > > > > > > Currently we send events to the ML via http, although this is
> >
> > > > > > > not optimal
> >
> > > > > >
> >
> > > > > > > Our use case is edge ML where we want a light weight wrapper
> >
> > > > > > > for Python code base.
> >
> > > > > > > Jython however does not work with the code base I'm think of
> >
> > > > > > > changing the interface to some thing like REDIS for
> >
> > > pub/sub
> >
> > > > > > > Id also like this to be a push deployment via minifi Also
> >
> > > > > > > support for sensors via protocols via Modbus and OPC would be
> >
> > > great
> >
> > > > > > > Craig
> >
> > > > > > >
> >
> > > > > > > > On Wed, Jul 31, 2019 at 1:43 AM Joe Witt joe.witt@gmail.com
> > <ma...@gmail.com>
> >
> > > wrote:
> >
> > > > > > > > Definitely something that I think would really help the
> >
> > > community. It
> >
> > > > > > > > might make sense to frame/structure these APIs such that an
> >
> > > internal
> >
> > > > > > > > option
> >
> > > > > > > > could be available to reduce dependencies and get up and
> >
> > > > > > > > running
> >
> > > but
> >
> > > > > > > > that
> >
> > > > > >
> >
> > > > > > > > also just as easily a remote implementation where the engine
> >
> > > lives and
> >
> > > > > > > > is
> >
> > > > > >
> >
> > > > > > > > managed externally could also be supported.
> >
> > > > > > > > Thanks
> >
> > > > > > > > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto
> >
> > > alopresto@apache.org<ma...@apache.org>
> >
> > > > > > > > wrote:
> >
> > > > > > > >
> >
> > > > > > > > > Yolanda,
> >
> > > > > > > > > I think this sounds like a great idea and will be very
> >
> > > > > > > > > useful
> >
> > > to
> >
> > > > > > > > > admins/users, as well as enabling some interesting
> >
> > > > > > > > > next-level functionality
> >
> > > > > > > >
> >
> > > > > > > > > and insight generation. Thanks for putting this out there.
> >
> > > > > > > > > Andy LoPresto
> >
> > > > > > > > > alopresto@apache.org<ma...@apache.org>
> >
> > > > > > > > > alopresto.apache@gmail.com<mailto:
> alopresto.apache@gmail.com>
> > PGP Fingerprint: 70EC B3E5 98A6
> >
> > > > > > > > > 5A3F D3C4 BACE 3C6E F65B 2F7D
> >
> > > EF69
> >
> > > > > > > > >
> >
> > > > > > > > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> >
> > > > > > > > > > yolanda.m.davis@gmail.com<mailto:
> yolanda.m.davis@gmail.com
> > >>
> >
> > > > > > > >
> >
> > > > > > > > > wrote:
> >
> > > > > > > > >
> >
> > > > > > > > > > Hello Everyone,
> >
> > > > > > > > > > I wanted to reach out to the community to discuss
> >
> > > > > > > > > > potentially enhancing
> >
> > > > > > > >
> >
> > > > > > > > > > NiFi to include predictive analytics that can help users
> >
> > > assess and
> >
> > > > > > > > > > predict
> >
> > > > > > > > > > NiFi behavior and performance. Currently NiFi has lots
> >
> > > > > > > > > > of
> >
> > > metrics
> >
> > > > > > > > > > available
> >
> > > > > > > > > > for areas including jvm and flow component usage (via
> >
> > > component
> >
> > > > > > > > > > status)
> >
> > > > > > > >
> >
> > > > > > > > > as
> >
> > > > > > > > >
> >
> > > > > > > > > > well as provenance data which NiFi makes available
> >
> > > > > > > > > > either
> >
> > > through
> >
> > > > > > > > > > the UI
> >
> > > > > > > >
> >
> > > > > > > > > or
> >
> > > > > > > > >
> >
> > > > > > > > > > reporting tasks (for consumption by other systems). Past
> >
> > > discussions
> >
> > > > > > > > > > in
> >
> > > > > > > >
> >
> > > > > > > > > the
> >
> > > > > > > > >
> >
> > > > > > > > > > community cite users shipping this data to applications
> >
> > > > > > > > > > such
> >
> > > as
> >
> > > > > > > > > > Prometheus,
> >
> > > > > > > > > > ELK stacks, or Ambari metrics for further analysis in
> >
> > > > > > > > > > order
> >
> > > to
> >
> > > > > > > > > > capture/review performance issues, detect anomalies, and
> >
> > > send alerts
> >
> > > > > > > > > > or
> >
> > > > > > > >
> >
> > > > > > > > > > notifications. These systems are efficient in capturing
> >
> > > > > > > > > > and
> >
> > > helping
> >
> > > > > > > > > > to
> >
> > > > > > > >
> >
> > > > > > > > > > analyze these metrics however it requires customization
> >
> > > > > > > > > > work
> >
> > > and
> >
> > > > > > > > > > knowledge
> >
> > > > > > > > > > of NiFi operations to provide meaningful analytics
> >
> > > > > > > > > > within a
> >
> > > flow
> >
> > > > > > > > > > context.
> >
> > > > > > > >
> >
> > > > > > > > > > In speaking with Matt Burgess and Andy Christianson on
> >
> > > > > > > > > > this
> >
> > > topic we
> >
> > > > > > > > > > feel
> >
> > > > > > > >
> >
> > > > > > > > > > that there is an opportunity to introduce an analytics
> >
> > > framework that
> >
> > > > > > > > > > could
> >
> > > > > > > > > > provide users reasonable predictions on key performance
> >
> > > indicators
> >
> > > > > > > > > > for
> >
> > > > > > > >
> >
> > > > > > > > > > flows, such as back pressure and flow rate, to help
> >
> > > administrators
> >
> > > > > > > > > > improve
> >
> > > > > > > > > > operational management of NiFi clusters. This framework
> >
> > > could offer
> >
> > > > > > > > > > several key features:
> >
> > > > > > > > > >
> >
> > > > > > > > > > -   Provide a flexible internal analytics engine and
> model
> >
> > > api which
> >
> > > > > > > > > >     supports the addition of or enhancement to onboard
> >
> > > > > > > > > > models
> >
> > > > > > > > > >
> >
> > > > > > > > > > -   Support integration of remote or cloud based ML
> models
> >
> > > > > > > > > > -   Support both traditional and online (incremental)
> >
> > > learning
> >
> > > > > > > > > >     methods
> >
> > > > > > > > > >
> >
> > > > > > > >
> >
> > > > > > > > > > -   Provide support for model caching (perhaps later
> >
> > > inclusion into
> >
> > > > > > > > > >     a
> >
> > > > > > > > > >
> >
> > > > > > > >
> >
> > > > > > > > > > model repository or registry)
> >
> > > > > > > > > >
> >
> > > > > > > > > > -   UI enhancements to display prediction information
> > either
> >
> > > in
> >
> > > > > > > > > >     existing
> >
> > > > > > > > > >
> >
> > > > > > > >
> >
> > > > > > > > > > summary data, new data visualizations, or directly
> >
> > > > > > > > > > within the flow/canvas (where applicable) For an initial
> >
> > > > > > > > > > target we thought that back pressure
> >
> > > prediction would
> >
> > > > > > > > > > be a
> >
> > > > > > > >
> >
> > > > > > > > > > good starting point for this initiative, given that back
> >
> > > pressure
> >
> > > > > > > > > > detection
> >
> > > > > > > > > > is a key indicator of flow performance and many of the
> >
> > > metrics
> >
> > > > > > > > > > currently
> >
> > > > > > > >
> >
> > > > > > > > > > available would provide enough data points to create a
> >
> > > reasonable
> >
> > > > > > > > > > performing model. We have some ideas on how this could
> >
> > > > > > > > > > be
> >
> > > achieved
> >
> > > > > > > > > > however
> >
> > > > > > > > > > we wanted to discuss this more with the community to get
> >
> > > thoughts
> >
> > > > > > > > > > about
> >
> > > > > > > >
> >
> > > > > > > > > > tackling this work, especially if there are specific use
> >
> > > cases or
> >
> > > > > > > > > > other
> >
> > > > > > > >
> >
> > > > > > > > > > factors that should be considered.
> >
> > > > > > > > > > Looking forward to everyone's thoughts and input.
> >
> > > > > > > > > > Thanks,
> >
> > > > > > > > > > -yolanda
> >
> > > > > > > > > > --
> >
> > > > > > > > > > yolanda.m.davis@gmail.com<mailto:
> yolanda.m.davis@gmail.com>
> > @YolandaMDavis
> >
> > > > > > >
> >
> > > > > > > --
> >
> > > > > > > Regards
> >
> > > > > > > Craig Knell
> >
> > > > > > > Mobile: +61 402 128 615
> >
> > > > > > > Skype: craigknell
> >
> > >
> >
> > >
> >
> > >
> >
> >
> >
> > --
> >
> > --
> >
> > yolanda.m.davis@gmail.com<ma...@gmail.com>
> >
> > @YolandaMDavis
> >
>
>
> --
> -------------------------------
> Rob Fellows
>


-- 
--
yolanda.m.davis@gmail.com
@YolandaMDavis

Re: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Robert Fellows <ro...@gmail.com>.
Hey Mark,
  I've started working on some UI based on the initial commit for this
proposal. What you have done and what I am working on overlap a bit, but
not much. I'm working on getting the predicted count and bytes into the
existing connection metric display that is already on the canvas. The only
overlap looks like it might be in the Summary table. I plan on adding a PR
for my additions, hopefully tomorrow. Maybe once it is up we can discuss how
we bring them together where it makes sense?

This is the main JIRA case: https://issues.apache.org/jira/browse/NIFI-6510
And this is the subtask that I am working toward:
https://issues.apache.org/jira/browse/NIFI-6568


-- Rob Fellows

On Mon, Aug 19, 2019 at 2:26 PM Owens, Mark <jm...@evoforge.org> wrote:

> The images from the previous email do not appear to be displaying. They can
> be viewed at:
> https://github.com/jmark99/nifi-images
>
> From: Owens, Mark <jm...@evoforge.org>
> Sent: Monday, August 19, 2019 2:25 PM
> To: dev@nifi.apache.org
> Subject: RE: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics
>
>
> Hi Yolanda,
>
>
>
> I've been working on a feature that appears to overlap with the work you
> are pursuing. Perhaps we should see whether we can coordinate our efforts.
> I've been updating NiFi to predict the time to queue overflow
> for both flowfiles and bytes and displaying that information in the GUI.
> For the initial attempt, I’ve been using a simple model of straight line
> prediction over a sliding window of 15 minutes to predict when flows will
> fail. This estimate is then displayed on both the NiFi Summary page under
> the connections tab and in the status history graphs.  Below are examples
> of what would be displayed to the user.
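
A minimal sketch of the straight-line idea described above, for readers who
want to see the arithmetic. It is an illustration only and is not taken from
the time-to-overflow branch: the class name, method names, sample format
(timestamp in seconds, queue size), and the default cap are all assumptions.
It fits a least-squares line to the samples inside a 15 minute window and
extrapolates to the queue's threshold.

```python
from collections import deque


class OverflowPredictor:
    """Hypothetical helper: linear fit over a sliding window of queue samples."""

    def __init__(self, threshold, window_seconds=15 * 60):
        self.threshold = threshold            # assumed queue limit (count or bytes)
        self.window_seconds = window_seconds  # sliding window length
        self.samples = deque()                # (timestamp_seconds, queue_size)

    def add_sample(self, timestamp, size):
        self.samples.append((timestamp, size))
        # Drop samples that have slid out of the window.
        while timestamp - self.samples[0][0] > self.window_seconds:
            self.samples.popleft()

    def seconds_to_overflow(self):
        """Estimated seconds until the threshold is reached, or None if the
        queue is not growing or there is too little data."""
        n = len(self.samples)
        if n < 2:
            return None
        mean_t = sum(t for t, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        var_t = sum((t - mean_t) ** 2 for t, _ in self.samples)
        if var_t == 0:
            return None
        slope = sum((t - mean_t) * (y - mean_y) for t, y in self.samples) / var_t
        if slope <= 0:
            return None  # flat or draining queue: no overflow predicted
        intercept = mean_y - slope * mean_t
        crossing_time = (self.threshold - intercept) / slope
        return max(0.0, crossing_time - self.samples[-1][0])


def capped(seconds, cap_seconds=12 * 3600):
    """Clamp an estimate to a maximum time of interest (assumed 12 hours)."""
    if seconds is None or seconds > cap_seconds:
        return cap_seconds
    return seconds
```

With samples of 0, 100, and 200 flowfiles taken one minute apart and a
threshold of 1000, the fitted line grows by 100 per minute, so the estimate is
roughly eight minutes past the last sample. A queue that is flat or draining
yields no estimate, which the cap helper reports as the maximum value.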
>
>
>
> [cid:image001.png@01D55696.E4CCD550]
>
>
>
> The Connection tab contains a new column on the right that displays the
> prediction for both flow files and data size. The user can select a maximum
> time at which specific times are no longer displayed. In this example, if
> the prediction lies beyond 12 hours then the display simply indicates that
> the flow is greater than 12 hours away from failure at the moment.
>
>
>
> [cid:image002.png@01D55697.2C8AC500]
>
>
>
> This display graphs the prediction for byte overflow over time. Note that
> if the estimate is greater than the user provided maximum value of interest
> the graph maxes out at that time, effectively indicating no overflow
> concerns.
>
>
>
> [cid:image003.png@01D55697.965C27D0]
>
>
>
> A similar display for flowfile count is displayed as well.
>
>
>
> The current state of work can be found at
> https://github.com/jmark99/nifi/tree/time-to-overflow
>
>
>
> I welcome your (or anyone else's) feedback on this effort.
>
>
>
> Thanks,
> Mark
>
>
>
> P.S. If the images are not displaying, they can be viewed at
> https://github.com/jmark99/nifi-images
>


-- 
-------------------------------
Rob Fellows

RE: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by "Owens, Mark" <jm...@evoforge.org>.
The images from the previous email do not appear to be displaying. They can be viewed at:
https://github.com/jmark99/nifi-images

RE: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by "Owens, Mark" <jm...@evoforge.org>.
Hi Yolanda,



I've been working on a feature that appears to overlap with the work you are pursuing. Perhaps we should try to coordinate our efforts. I've been updating NiFi to predict the time to queue overflow for both flowfiles and bytes, and to display that information in the GUI. For the initial attempt, I’ve been using a simple straight-line prediction over a 15-minute sliding window to predict when flows will fail. This estimate is then displayed both on the NiFi Summary page under the Connections tab and in the status history graphs.  Below are examples of what would be displayed to the user.



[cid:image001.png@01D55696.E4CCD550]



The Connections tab contains a new column on the right that displays the prediction for both flowfile count and data size. The user can select a maximum time beyond which specific estimates are no longer displayed. In this example, if the prediction lies beyond 12 hours, the display simply indicates that the flow is currently more than 12 hours away from failure.



[cid:image002.png@01D55697.2C8AC500]



This display graphs the prediction for byte overflow over time. Note that if the estimate is greater than the user-provided maximum value of interest, the graph maxes out at that value, effectively indicating no overflow concerns.
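
The capping behaviour described for the Connections tab column and the status history graph could look roughly like the sketch below. This is only an illustration, not code from the time-to-overflow branch: the function names, the HH:MM:SS format, the "> 12 hours" label, and the choice to treat a missing estimate as "beyond the maximum" are all assumptions.

```python
from typing import Optional


def format_prediction(estimate_ms: Optional[float], max_hours: int = 12) -> str:
    """Render an overflow estimate as HH:MM:SS, or a capped label beyond max_hours.

    Hypothetical helper: the label text and time format are assumptions.
    """
    max_ms = max_hours * 60 * 60 * 1000
    # No estimate (for example, a queue that is not growing) is shown as "beyond the maximum".
    if estimate_ms is None or estimate_ms > max_ms:
        return f"> {max_hours} hours"
    total_seconds = int(estimate_ms // 1000)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def graph_value_ms(estimate_ms: Optional[float], max_hours: int = 12) -> float:
    """Value to plot in the graph: the estimate, capped at the user-provided maximum."""
    max_ms = max_hours * 60 * 60 * 1000
    if estimate_ms is None:
        return float(max_ms)
    return float(min(estimate_ms, max_ms))
```

Capping the plotted value keeps the graph's vertical axis bounded, so a flat line at the maximum reads as "no overflow concerns" rather than stretching the scale.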



[cid:image003.png@01D55697.965C27D0]



A similar graph is displayed for flowfile count.
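
For readers who want to see the idea in code, the straight-line prediction over a 15-minute sliding window described earlier in this message might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the branch: it assumes queue samples arrive as (timestamp in milliseconds, queue size) pairs, uses an ordinary least-squares line fit, and all names (`fit_line`, `time_to_overflow_ms`, `WINDOW_MS`) are hypothetical. The same function would apply to either flowfile count or bytes, given the matching threshold.

```python
from typing import Optional, Sequence, Tuple

WINDOW_MS = 15 * 60 * 1000  # 15-minute sliding window (assumed to be in milliseconds)


def fit_line(samples: Sequence[Tuple[float, float]]) -> Optional[Tuple[float, float]]:
    """Least-squares fit of value = slope * t + intercept; None if no line can be fitted."""
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def time_to_overflow_ms(
    samples: Sequence[Tuple[float, float]], threshold: float, now_ms: float
) -> Optional[float]:
    """Milliseconds from now_ms until the fitted line reaches threshold.

    Returns None when no overflow is predicted (too few samples or a queue
    that is flat or draining), and 0.0 when the line is already at or past
    the threshold.
    """
    # Keep only the samples that fall inside the sliding window.
    recent = [(t, v) for t, v in samples if now_ms - WINDOW_MS <= t <= now_ms]
    fit = fit_line(recent)
    if fit is None:
        return None
    slope, intercept = fit
    if slope <= 0:
        return None
    crossing_ms = (threshold - intercept) / slope
    return max(0.0, crossing_ms - now_ms)
```

A straight-line fit is cheap to recompute on every status update, but it assumes the recent growth rate continues unchanged, so bursty flows may produce estimates that swing noticeably between updates.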



The current state of work can be found at https://github.com/jmark99/nifi/tree/time-to-overflow



I welcome your (or anyone else's) feedback on this effort.



Thanks,
Mark



P.S. If the images are not displaying, they can be viewed at https://github.com/jmark99/nifi-images







-----Original Message-----
From: Yolanda Davis <yo...@gmail.com>
Sent: Monday, August 19, 2019 11:29 AM
To: dev@nifi.apache.org
Subject: Re:[EXT] [DISCUSS] Predictive Analytics for NiFi Metrics



Hello All,



I just wanted to follow up on the discussion we started a couple of weeks ago concerning an analytics framework for NiFi metrics.  Working with Andy Christianson and Matt Burgess we shaped our ideas and drafted a proposal for this feature on the Apache NiFi Wiki [1] . We've also begun implementing some of these ideas in a feature branch (which is work in

progress) [2].  We’d appreciate any questions or feedback you may have.



Thanks,



-yolanda



[1] -

https://cwiki.apache.org/confluence/display/NIFI/Operational+Analytics+Framework+for+NiFi

[2] - https://github.com/apache/nifi/commits/analytics-framework



On Wed, Jul 31, 2019 at 9:58 AM Andy Christianson <ai...@protonmail.com.invalid>> wrote:



> As someone who operated a 24/7 mission-critical NiFi flow, this

> feature would have been a life saver. If I'm heading home on a Friday,

> it would be great to have some blinking red lights to let me know that

> the system predicts that it is going to experience backpressure

> sometime over the weekend, so that corrective action could be taken before leaving.

>

> Since there is support in the community for this, I created a JIRA to

> track the effort:

>

> https://issues.apache.org/jira/browse/NIFI-6510

>

> I also created a JIRA to track the remote protocol:

>

> https://issues.apache.org/jira/browse/NIFI-6511

>

>

> Regards,

>

> Andy

>

>

> Sent from ProtonMail, Swiss-based encrypted email.

>

> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

> On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <ab...@apache.org>> wrote:

>

> > If you could share a bit more details about your OPC and Modbus

> > usage,

> that

> > would be highly appreciated!

> >

> > On Wed, Jul 31, 2019 at 12:01 PM Craig Knell craig.knell@gmail.com<ma...@gmail.com>

> wrote:

> >

> > > Sounds. Great

> > > Let me know if you need some help

> > > Best regards

> > > Craig

> > >

> > > > On 31 Jul 2019, at 17:31, Arpad Boda aboda@cloudera.com.invalid<ma...@cloudera.com.invalid>

> wrote:

> > > > Craig,

> > > > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and

> Modbus (

> > > > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the

> way for

> > > > MiNiFi c++, hopefully both will be part of next release (0.7.0).

> > > > It's gonna be legen... wait for it! :) Regards, Arpad

> > > >

> > > > > On Wed, Jul 31, 2019 at 2:30 AM Craig Knell

> > > > > craig.knell@gmail.com<ma...@gmail.com>

> > > > > wrote:

> > > >

> > > > > Hi Folks

> > > > > That's our use case now. All our Models are run in python.

> > > > > Currently we send events to the ML via http, although this is

> > > > > not optimal

> > > >

> > > > > Our use case is edge ML where we want a light weight wrapper

> > > > > for Python code base.

> > > > > Jython however does not work with the code base I'm think of

> > > > > changing the interface to some thing like REDIS for

> pub/sub

> > > > > Id also like this to be a push deployment via minifi Also

> > > > > support for sensors via protocols via Modbus and OPC would be

> great

> > > > > Craig

> > > > >

> > > > > > On Wed, Jul 31, 2019 at 1:43 AM Joe Witt joe.witt@gmail.com<ma...@gmail.com>

> wrote:

> > > > > > Definitely something that I think would really help the

> community. It

> > > > > > might make sense to frame/structure these APIs such that an

> internal

> > > > > > option

> > > > > > could be available to reduce dependencies and get up and

> > > > > > running

> but

> > > > > > that

> > > >

> > > > > > also just as easily a remote implementation where the engine

> lives and

> > > > > > is

> > > >

> > > > > > managed externally could also be supported.

> > > > > > Thanks

> > > > > > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto

> alopresto@apache.org<ma...@apache.org>

> > > > > > wrote:

> > > > > >

> > > > > > > Yolanda,

> > > > > > > I think this sounds like a great idea and will be very

> > > > > > > useful

> to

> > > > > > > admins/users, as well as enabling some interesting

> > > > > > > next-level functionality

> > > > > >

> > > > > > > and insight generation. Thanks for putting this out there.

> > > > > > > Andy LoPresto

> > > > > > > alopresto@apache.org<ma...@apache.org>

> > > > > > > alopresto.apache@gmail.com<ma...@gmail.com> PGP Fingerprint: 70EC B3E5 98A6

> > > > > > > 5A3F D3C4 BACE 3C6E F65B 2F7D

> EF69

> > > > > > >

> > > > > > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <

> > > > > > > > yolanda.m.davis@gmail.com<ma...@gmail.com>>

> > > > > >

> > > > > > > wrote:

> > > > > > >

> > > > > > > > Hello Everyone,

> > > > > > > > I wanted to reach out to the community to discuss

> > > > > > > > potentially enhancing

> > > > > >

> > > > > > > > NiFi to include predictive analytics that can help users

> assess and

> > > > > > > > predict

> > > > > > > > NiFi behavior and performance. Currently NiFi has lots

> > > > > > > > of

> metrics

> > > > > > > > available

> > > > > > > > for areas including jvm and flow component usage (via

> component

> > > > > > > > status)

> > > > > >

> > > > > > > as

> > > > > > >

> > > > > > > > well as provenance data which NiFi makes available

> > > > > > > > either

> through

> > > > > > > > the UI

> > > > > >

> > > > > > > or

> > > > > > >

> > > > > > > > reporting tasks (for consumption by other systems). Past

> discussions

> > > > > > > > in

> > > > > >

> > > > > > > the

> > > > > > >

> > > > > > > > community cite users shipping this data to applications

> > > > > > > > such

> as

> > > > > > > > Prometheus,

> > > > > > > > ELK stacks, or Ambari metrics for further analysis in

> > > > > > > > order

> to

> > > > > > > > capture/review performance issues, detect anomalies, and

> send alerts

> > > > > > > > or

> > > > > >

> > > > > > > > notifications. These systems are efficient in capturing

> > > > > > > > and

> helping

> > > > > > > > to

> > > > > >

> > > > > > > > analyze these metrics however it requires customization

> > > > > > > > work

> and

> > > > > > > > knowledge

> > > > > > > > of NiFi operations to provide meaningful analytics

> > > > > > > > within a

> flow

> > > > > > > > context.

> > > > > >

> > > > > > > > In speaking with Matt Burgess and Andy Christianson on

> > > > > > > > this

> topic we

> > > > > > > > feel

> > > > > >

> > > > > > > > that there is an opportunity to introduce an analytics

> framework that

> > > > > > > > could

> > > > > > > > provide users reasonable predictions on key performance

> indicators

> > > > > > > > for

> > > > > >

> > > > > > > > flows, such as back pressure and flow rate, to help

> administrators

> > > > > > > > improve operational management of NiFi clusters. This framework
> > > > > > > > could offer several key features:
> > > > > > > >
> > > > > > > > -   Provide a flexible internal analytics engine and model api
> > > > > > > >     which supports the addition of or enhancement to onboard
> > > > > > > >     models
> > > > > > > > -   Support integration of remote or cloud based ML models
> > > > > > > > -   Support both traditional and online (incremental) learning
> > > > > > > >     methods
> > > > > > > > -   Provide support for model caching (perhaps later inclusion
> > > > > > > >     into a model repository or registry)
> > > > > > > > -   UI enhancements to display prediction information either in
> > > > > > > >     existing summary data, new data visualizations, or directly
> > > > > > > >     within the flow/canvas (where applicable)
> > > > > > > >
> > > > > > > > For an initial target we thought that back pressure prediction
> > > > > > > > would be a good starting point for this initiative, given that
> > > > > > > > back pressure detection is a key indicator of flow performance
> > > > > > > > and many of the metrics currently available would provide enough
> > > > > > > > data points to create a reasonable performing model. We have some
> > > > > > > > ideas on how this could be achieved however we wanted to discuss
> > > > > > > > this more with the community to get thoughts about tackling this
> > > > > > > > work, especially if there are specific use cases or other factors
> > > > > > > > that should be considered.
> > > > > > > >
> > > > > > > > Looking forward to everyone's thoughts and input.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > -yolanda
> > > > > > > >
> > > > > > > > --
> > > > > > > > yolanda.m.davis@gmail.com
> > > > > > > > @YolandaMDavis
> > > > >
> > > > > --
> > > > > Regards
> > > > > Craig Knell
> > > > > Mobile: +61 402 128 615
> > > > > Skype: craigknell
>



--
--
yolanda.m.davis@gmail.com
@YolandaMDavis

Re: [EXT] [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Yolanda Davis <yo...@gmail.com>.
Hello All,

I just wanted to follow up on the discussion we started a couple of weeks
ago concerning an analytics framework for NiFi metrics.  Working with Andy
Christianson and Matt Burgess, we shaped our ideas and drafted a proposal
for this feature on the Apache NiFi Wiki [1]. We've also begun
implementing some of these ideas in a feature branch [2], which is still a
work in progress.  We'd appreciate any questions or feedback you may have.

Thanks,

-yolanda

[1] -
https://cwiki.apache.org/confluence/display/NIFI/Operational+Analytics+Framework+for+NiFi
[2] - https://github.com/apache/nifi/commits/analytics-framework
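
For readers new to the thread, the back pressure prediction discussed here
can be illustrated with a minimal sketch. It assumes a plain least-squares
line fitted to recent queue-size samples, which is only one possible model
and is not taken from the proposal or the feature branch. The function
names, the sample layout, and the threshold value are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_line(times: Sequence[float], counts: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of counts = slope * time + intercept."""
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two (time, count) samples of equal length")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    slope = cov / var_t
    return slope, mean_c - slope * mean_t


def seconds_until_backpressure(
    times: Sequence[float],
    counts: Sequence[float],
    threshold: float,
) -> Optional[float]:
    """Estimate seconds from the latest sample until the queue reaches threshold.

    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if the fitted trend is flat or shrinking (no predicted breach).
    """
    slope, intercept = fit_line(times, counts)
    if counts[-1] >= threshold:
        return 0.0
    if slope <= 0:
        return None
    time_of_breach = (threshold - intercept) / slope
    return max(0.0, time_of_breach - times[-1])
```

With samples taken at 0, 60, 120 and 180 seconds showing 1000, 2000, 3000
and 4000 queued objects, and an illustrative threshold of 10000 objects,
the estimate is about 360 seconds after the last sample. A real model would
also need to handle noisy or bursty queues, which a straight line does not.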

On Wed, Jul 31, 2019 at 9:58 AM Andy Christianson
<ai...@protonmail.com.invalid> wrote:

> As someone who operated a 24/7 mission-critical NiFi flow, this feature
> would have been a life saver. If I'm heading home on a Friday, it would be
> great to have some blinking red lights to let me know that the system
> predicts that it is going to experience backpressure sometime over the
> weekend, so that corrective action could be taken before leaving.
>
> Since there is support in the community for this, I created a JIRA to
> track the effort:
>
> https://issues.apache.org/jira/browse/NIFI-6510
>
> I also created a JIRA to track the remote protocol:
>
> https://issues.apache.org/jira/browse/NIFI-6511
>
>
> Regards,
>
> Andy
>
>
> Sent from ProtonMail, Swiss-based encrypted email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <ab...@apache.org> wrote:
>
> > If you could share a bit more details about your OPC and Modbus usage,
> that
> > would be highly appreciated!
> >
> > On Wed, Jul 31, 2019 at 12:01 PM Craig Knell craig.knell@gmail.com
> wrote:
> >
> > > Sounds. Great
> > > Let me know if you need some help
> > > Best regards
> > > Craig
> > >
> > > > On 31 Jul 2019, at 17:31, Arpad Boda aboda@cloudera.com.invalid
> wrote:
> > > > Craig,
> > > > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and
> Modbus (
> > > > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the
> way for
> > > > MiNiFi c++, hopefully both will be part of next release (0.7.0).
> > > > It's gonna be legen... wait for it! :)
> > > > Regards,
> > > > Arpad
> > > >
> > > > > On Wed, Jul 31, 2019 at 2:30 AM Craig Knell craig.knell@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Folks
> > > > > That's our use case now. All our Models are run in python.
> > > > > Currently we send events to the ML via http, although this is not
> > > > > optimal
> > > >
> > > > > Our use case is edge ML where we want a light weight wrapper for
> > > > > Python code base.
> > > > > Jython however does not work with the code base
> > > > > I'm think of changing the interface to some thing like REDIS for
> pub/sub
> > > > > Id also like this to be a push deployment via minifi
> > > > > Also support for sensors via protocols via Modbus and OPC would be
> great
> > > > > Craig
> > > > >
> > > > > > On Wed, Jul 31, 2019 at 1:43 AM Joe Witt joe.witt@gmail.com
> wrote:
> > > > > > Definitely something that I think would really help the
> community. It
> > > > > > might make sense to frame/structure these APIs such that an
> internal
> > > > > > option
> > > > > > could be available to reduce dependencies and get up and running
> but
> > > > > > that
> > > >
> > > > > > also just as easily a remote implementation where the engine
> lives and
> > > > > > is
> > > >
> > > > > > managed externally could also be supported.
> > > > > > Thanks
> > > > > > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto
> alopresto@apache.org
> > > > > > wrote:
> > > > > >
> > > > > > > Yolanda,
> > > > > > > I think this sounds like a great idea and will be very useful
> to
> > > > > > > admins/users, as well as enabling some interesting next-level
> > > > > > > functionality
> > > > > >
> > > > > > > and insight generation. Thanks for putting this out there.
> > > > > > > Andy LoPresto
> > > > > > > alopresto@apache.org
> > > > > > > alopresto.apache@gmail.com
> > > > > > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D
> EF69
> > > > > > >
> > > > > > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> > > > > > > > yolanda.m.davis@gmail.com>
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello Everyone,
> > > > > > > > I wanted to reach out to the community to discuss potentially
> > > > > > > > enhancing
> > > > > >
> > > > > > > > NiFi to include predictive analytics that can help users
> assess and
> > > > > > > > predict
> > > > > > > > NiFi behavior and performance. Currently NiFi has lots of
> metrics
> > > > > > > > available
> > > > > > > > for areas including jvm and flow component usage (via
> component
> > > > > > > > status)
> > > > > >
> > > > > > > as
> > > > > > >
> > > > > > > > well as provenance data which NiFi makes available either
> through
> > > > > > > > the UI
> > > > > >
> > > > > > > or
> > > > > > >
> > > > > > > > reporting tasks (for consumption by other systems). Past
> discussions
> > > > > > > > in
> > > > > >
> > > > > > > the
> > > > > > >
> > > > > > > > community cite users shipping this data to applications such
> as
> > > > > > > > Prometheus,
> > > > > > > > ELK stacks, or Ambari metrics for further analysis in order
> to
> > > > > > > > capture/review performance issues, detect anomalies, and
> send alerts
> > > > > > > > or
> > > > > >
> > > > > > > > notifications. These systems are efficient in capturing and
> helping
> > > > > > > > to
> > > > > >
> > > > > > > > analyze these metrics however it requires customization work
> and
> > > > > > > > knowledge
> > > > > > > > of NiFi operations to provide meaningful analytics within a
> flow
> > > > > > > > context.
> > > > > >
> > > > > > > > In speaking with Matt Burgess and Andy Christianson on this
> topic we
> > > > > > > > feel
> > > > > >
> > > > > > > > that there is an opportunity to introduce an analytics
> framework that
> > > > > > > > could
> > > > > > > > provide users reasonable predictions on key performance
> indicators
> > > > > > > > for
> > > > > >
> > > > > > > > flows, such as back pressure and flow rate, to help
> administrators
> > > > > > > > improve
> > > > > > > > operational management of NiFi clusters. This framework
> could offer
> > > > > > > > several key features:
> > > > > > > >
> > > > > > > > -   Provide a flexible internal analytics engine and model
> api which
> > > > > > > >     supports the addition of or enhancement to onboard models
> > > > > > > >
> > > > > > > > -   Support integration of remote or cloud based ML models
> > > > > > > > -   Support both traditional and online (incremental)
> learning
> > > > > > > >     methods
> > > > > > > >
> > > > > >
> > > > > > > > -   Provide support for model caching (perhaps later
> inclusion into
> > > > > > > >     a
> > > > > > > >
> > > > > >
> > > > > > > > model repository or registry)
> > > > > > > >
> > > > > > > > -   UI enhancements to display prediction information either
> in
> > > > > > > >     existing
> > > > > > > >
> > > > > >
> > > > > > > > summary data, new data visualizations, or directly within the
> > > > > > > > flow/canvas
> > > > > > > > (where applicable)
> > > > > > > > For an initial target we thought that back pressure
> prediction would
> > > > > > > > be a
> > > > > >
> > > > > > > > good starting point for this initiative, given that back
> pressure
> > > > > > > > detection
> > > > > > > > is a key indicator of flow performance and many of the
> metrics
> > > > > > > > currently
> > > > > >
> > > > > > > > available would provide enough data points to create a
> reasonable
> > > > > > > > performing model. We have some ideas on how this could be
> achieved
> > > > > > > > however
> > > > > > > > we wanted to discuss this more with the community to get
> thoughts
> > > > > > > > about
> > > > > >
> > > > > > > > tackling this work, especially if there are specific use
> cases or
> > > > > > > > other
> > > > > >
> > > > > > > > factors that should be considered.
> > > > > > > > Looking forward to everyone's thoughts and input.
> > > > > > > > Thanks,
> > > > > > > > -yolanda
> > > > > > > > --
> > > > > > > > yolanda.m.davis@gmail.com
> > > > > > > > @YolandaMDavis
> > > > >
> > > > > --
> > > > > Regards
> > > > > Craig Knell
> > > > > Mobile: +61 402 128 615
> > > > > Skype: craigknell
>
>
>

-- 
--
yolanda.m.davis@gmail.com
@YolandaMDavis

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Andy Christianson <ai...@protonmail.com.INVALID>.
As someone who operated a 24/7 mission-critical NiFi flow, this feature would have been a life saver. If I'm heading home on a Friday, it would be great to have some blinking red lights to let me know that the system predicts that it is going to experience backpressure sometime over the weekend, so that corrective action could be taken before leaving.

Since there is support in the community for this, I created a JIRA to track the effort:

https://issues.apache.org/jira/browse/NIFI-6510

I also created a JIRA to track the remote protocol:

https://issues.apache.org/jira/browse/NIFI-6511


Regards,

Andy


Sent from ProtonMail, Swiss-based encrypted email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, July 31, 2019 6:57 AM, Arpad Boda <ab...@apache.org> wrote:

> If you could share a bit more details about your OPC and Modbus usage, that
> would be highly appreciated!
>
> On Wed, Jul 31, 2019 at 12:01 PM Craig Knell craig.knell@gmail.com wrote:
>
> > Sounds. Great
> > Let me know if you need some help
> > Best regards
> > Craig
> >
> > > On 31 Jul 2019, at 17:31, Arpad Boda aboda@cloudera.com.invalid wrote:
> > > Craig,
> > > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and Modbus (
> > > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the way for
> > > MiNiFi c++, hopefully both will be part of next release (0.7.0).
> > > It's gonna be legen... wait for it! :)
> > > Regards,
> > > Arpad
> > >
> > > > On Wed, Jul 31, 2019 at 2:30 AM Craig Knell craig.knell@gmail.com
> > > > wrote:
> > >
> > > > Hi Folks
> > > > That's our use case now. All our Models are run in python.
> > > > Currently we send events to the ML via http, although this is not
> > > > optimal
> > >
> > > > Our use case is edge ML where we want a light weight wrapper for
> > > > Python code base.
> > > > Jython however does not work with the code base
> > > > I'm think of changing the interface to some thing like REDIS for pub/sub
> > > > Id also like this to be a push deployment via minifi
> > > > Also support for sensors via protocols via Modbus and OPC would be great
> > > > Craig
> > > >
> > > > > On Wed, Jul 31, 2019 at 1:43 AM Joe Witt joe.witt@gmail.com wrote:
> > > > > Definitely something that I think would really help the community. It
> > > > > might make sense to frame/structure these APIs such that an internal
> > > > > option
> > > > > could be available to reduce dependencies and get up and running but
> > > > > that
> > >
> > > > > also just as easily a remote implementation where the engine lives and
> > > > > is
> > >
> > > > > managed externally could also be supported.
> > > > > Thanks
> > > > > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto alopresto@apache.org
> > > > > wrote:
> > > > >
> > > > > > Yolanda,
> > > > > > I think this sounds like a great idea and will be very useful to
> > > > > > admins/users, as well as enabling some interesting next-level
> > > > > > functionality
> > > > >
> > > > > > and insight generation. Thanks for putting this out there.
> > > > > > Andy LoPresto
> > > > > > alopresto@apache.org
> > > > > > alopresto.apache@gmail.com
> > > > > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> > > > > >
> > > > > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> > > > > > > yolanda.m.davis@gmail.com>
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hello Everyone,
> > > > > > > I wanted to reach out to the community to discuss potentially
> > > > > > > enhancing
> > > > >
> > > > > > > NiFi to include predictive analytics that can help users assess and
> > > > > > > predict
> > > > > > > NiFi behavior and performance. Currently NiFi has lots of metrics
> > > > > > > available
> > > > > > > for areas including jvm and flow component usage (via component
> > > > > > > status)
> > > > >
> > > > > > as
> > > > > >
> > > > > > > well as provenance data which NiFi makes available either through
> > > > > > > the UI
> > > > >
> > > > > > or
> > > > > >
> > > > > > > reporting tasks (for consumption by other systems). Past discussions
> > > > > > > in
> > > > >
> > > > > > the
> > > > > >
> > > > > > > community cite users shipping this data to applications such as
> > > > > > > Prometheus,
> > > > > > > ELK stacks, or Ambari metrics for further analysis in order to
> > > > > > > capture/review performance issues, detect anomalies, and send alerts
> > > > > > > or
> > > > >
> > > > > > > notifications. These systems are efficient in capturing and helping
> > > > > > > to
> > > > >
> > > > > > > analyze these metrics however it requires customization work and
> > > > > > > knowledge
> > > > > > > of NiFi operations to provide meaningful analytics within a flow
> > > > > > > context.
> > > > >
> > > > > > > In speaking with Matt Burgess and Andy Christianson on this topic we
> > > > > > > feel
> > > > >
> > > > > > > that there is an opportunity to introduce an analytics framework that
> > > > > > > could
> > > > > > > provide users reasonable predictions on key performance indicators
> > > > > > > for
> > > > >
> > > > > > > flows, such as back pressure and flow rate, to help administrators
> > > > > > > improve
> > > > > > > operational management of NiFi clusters. This framework could offer
> > > > > > > several key features:
> > > > > > >
> > > > > > > -   Provide a flexible internal analytics engine and model api which
> > > > > > >     supports the addition of or enhancement to onboard models
> > > > > > >
> > > > > > > -   Support integration of remote or cloud based ML models
> > > > > > > -   Support both traditional and online (incremental) learning
> > > > > > >     methods
> > > > > > >
> > > > >
> > > > > > > -   Provide support for model caching (perhaps later inclusion into
> > > > > > >     a
> > > > > > >
> > > > >
> > > > > > > model repository or registry)
> > > > > > >
> > > > > > > -   UI enhancements to display prediction information either in
> > > > > > >     existing
> > > > > > >
> > > > >
> > > > > > > summary data, new data visualizations, or directly within the
> > > > > > > flow/canvas
> > > > > > > (where applicable)
> > > > > > > For an initial target we thought that back pressure prediction would
> > > > > > > be a
> > > > >
> > > > > > > good starting point for this initiative, given that back pressure
> > > > > > > detection
> > > > > > > is a key indicator of flow performance and many of the metrics
> > > > > > > currently
> > > > >
> > > > > > > available would provide enough data points to create a reasonable
> > > > > > > performing model. We have some ideas on how this could be achieved
> > > > > > > however
> > > > > > > we wanted to discuss this more with the community to get thoughts
> > > > > > > about
> > > > >
> > > > > > > tackling this work, especially if there are specific use cases or
> > > > > > > other
> > > > >
> > > > > > > factors that should be considered.
> > > > > > > Looking forward to everyone's thoughts and input.
> > > > > > > Thanks,
> > > > > > > -yolanda
> > > > > > > --
> > > > > > > yolanda.m.davis@gmail.com
> > > > > > > @YolandaMDavis
> > > >
> > > > --
> > > > Regards
> > > > Craig Knell
> > > > Mobile: +61 402 128 615
> > > > Skype: craigknell



Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Arpad Boda <ab...@apache.org>.
If you could share a few more details about your OPC and Modbus usage, that
would be highly appreciated!

On Wed, Jul 31, 2019 at 12:01 PM Craig Knell <cr...@gmail.com> wrote:

> Sounds. Great
>
> Let me know if you need some help
>
> Best regards
>
> Craig
>
>
>
> > On 31 Jul 2019, at 17:31, Arpad Boda <ab...@cloudera.com.invalid> wrote:
> >
> > Craig,
> >
> > OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and Modbus (
> > https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the way for
> > MiNiFi c++, hopefully both will be part of next release (0.7.0).
> > It's gonna be legen... wait for it! :)
> >
> > Regards,
> > Arpad
> >
> >> On Wed, Jul 31, 2019 at 2:30 AM Craig Knell <cr...@gmail.com>
> wrote:
> >>
> >> Hi Folks
> >>
> >> That's our use case now.  All our Models are run in python.
> >> Currently we send events to the ML via http, although this is not
> optimal
> >>
> >> Our use case is edge ML where we want a light weight wrapper for
> >> Python code base.
> >> Jython however does not work with the code base
> >> I'm think of changing the interface to some thing like REDIS for pub/sub
> >> Id also like this to be a push deployment via minifi
> >>
> >> Also support for sensors via protocols via Modbus and OPC would be great
> >>
> >> Craig
> >>
> >>> On Wed, Jul 31, 2019 at 1:43 AM Joe Witt <jo...@gmail.com> wrote:
> >>>
> >>> Definitely something that I think would really help the community.  It
> >>> might make sense to frame/structure these APIs such that an internal
> >> option
> >>> could be available to reduce dependencies and get up and running but
> that
> >>> also just as easily a remote implementation where the engine lives and
> is
> >>> managed externally could also be supported.
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto <al...@apache.org>
> >> wrote:
> >>>
> >>>> Yolanda,
> >>>>
> >>>> I think this sounds like a great idea and will be very useful to
> >>>> admins/users, as well as enabling some interesting next-level
> >> functionality
> >>>> and insight generation. Thanks for putting this out there.
> >>>>
> >>>> Andy LoPresto
> >>>> alopresto@apache.org
> >>>> alopresto.apache@gmail.com
> >>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >>>>
> >>>>> On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> >> yolanda.m.davis@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Hello Everyone,
> >>>>>
> >>>>> I wanted to reach out to the community to discuss potentially
> >> enhancing
> >>>>> NiFi to include predictive analytics that can help users assess and
> >>>> predict
> >>>>> NiFi behavior and performance. Currently NiFi has lots of metrics
> >>>> available
> >>>>> for areas including jvm and flow component usage (via component
> >> status)
> >>>> as
> >>>>> well as provenance data which NiFi makes available either through
> >> the UI
> >>>> or
> >>>>> reporting tasks (for consumption by other systems). Past discussions
> >> in
> >>>> the
> >>>>> community cite users shipping this data to applications such as
> >>>> Prometheus,
> >>>>> ELK stacks, or Ambari metrics for further analysis in order to
> >>>>> capture/review performance issues, detect anomalies, and send alerts
> >> or
> >>>>> notifications.  These systems are efficient in capturing and helping
> >> to
> >>>>> analyze these metrics however it requires customization work and
> >>>> knowledge
> >>>>> of NiFi operations to provide meaningful analytics within a flow
> >> context.
> >>>>>
> >>>>> In speaking with Matt Burgess and Andy Christianson on this topic we
> >> feel
> >>>>> that there is an opportunity to introduce an analytics framework that
> >>>> could
> >>>>> provide users reasonable predictions on key performance indicators
> >> for
> >>>>> flows, such as back pressure and flow rate, to help administrators
> >>>> improve
> >>>>> operational management of NiFi clusters.  This framework could offer
> >>>>> several key features:
> >>>>>
> >>>>>  - Provide a flexible internal analytics engine and model api which
> >>>>>  supports the addition of or enhancement to onboard models
> >>>>>  - Support integration of remote or cloud based ML models
> >>>>>  - Support both traditional and online (incremental) learning
> >> methods
> >>>>>  - Provide support for model caching  (perhaps later inclusion into
> >> a
> >>>>>  model repository or registry)
> >>>>>  - UI enhancements to display prediction information either in
> >> existing
> >>>>>  summary data, new data visualizations, or directly within the
> >>>> flow/canvas
> >>>>>  (where applicable)
> >>>>>
> >>>>> For an initial target we thought that back pressure prediction would
> >> be a
> >>>>> good starting point for this initiative, given that back pressure
> >>>> detection
> >>>>> is a key indicator of flow performance and many of the metrics
> >> currently
> >>>>> available would provide enough data points to create a reasonable
> >>>>> performing model.  We have some ideas on how this could be achieved
> >>>> however
> >>>>> we wanted to discuss this more with the community to get thoughts
> >> about
> >>>>> tackling this work, especially if there are specific use cases or
> >> other
> >>>>> factors that should be considered.
> >>>>>
> >>>>> Looking forward to everyone's thoughts and input.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> -yolanda
> >>>>>
> >>>>> --
> >>>>> yolanda.m.davis@gmail.com
> >>>>> @YolandaMDavis
> >>>>
> >>>>
> >>
> >>
> >>
> >> --
> >> Regards
> >>
> >> Craig Knell
> >> Mobile: +61 402 128 615
> >> Skype: craigknell
> >>
>

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Craig Knell <cr...@gmail.com>.
Sounds great.

Let me know if you need some help

Best regards

Craig 



> On 31 Jul 2019, at 17:31, Arpad Boda <ab...@cloudera.com.invalid> wrote:
> 
> Craig,
> 
> OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and Modbus (
> https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the way for
> MiNiFi c++, hopefully both will be part of next release (0.7.0).
> It's gonna be legen... wait for it! :)
> 
> Regards,
> Arpad
> 
>> On Wed, Jul 31, 2019 at 2:30 AM Craig Knell <cr...@gmail.com> wrote:
>> 
>> Hi Folks
>> 
>> That's our use case now.  All our Models are run in python.
>> Currently we send events to the ML via http, although this is not optimal
>> 
>> Our use case is edge ML where we want a light weight wrapper for
>> Python code base.
>> Jython however does not work with the code base
>> I'm think of changing the interface to some thing like REDIS for pub/sub
>> Id also like this to be a push deployment via minifi
>> 
>> Also support for sensors via protocols via Modbus and OPC would be great
>> 
>> Craig
>> 
>>> On Wed, Jul 31, 2019 at 1:43 AM Joe Witt <jo...@gmail.com> wrote:
>>> 
>>> Definitely something that I think would really help the community.  It
>>> might make sense to frame/structure these APIs such that an internal
>> option
>>> could be available to reduce dependencies and get up and running but that
>>> also just as easily a remote implementation where the engine lives and is
>>> managed externally could also be supported.
>>> 
>>> Thanks
>>> 
>>> 
>>> On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto <al...@apache.org>
>> wrote:
>>> 
>>>> Yolanda,
>>>> 
>>>> I think this sounds like a great idea and will be very useful to
>>>> admins/users, as well as enabling some interesting next-level
>> functionality
>>>> and insight generation. Thanks for putting this out there.
>>>> 
>>>> Andy LoPresto
>>>> alopresto@apache.org
>>>> alopresto.apache@gmail.com
>>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>>> 
>>>>> On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
>> yolanda.m.davis@gmail.com>
>>>> wrote:
>>>>> 
>>>>> Hello Everyone,
>>>>> 
>>>>> I wanted to reach out to the community to discuss potentially
>> enhancing
>>>>> NiFi to include predictive analytics that can help users assess and
>>>> predict
>>>>> NiFi behavior and performance. Currently NiFi has lots of metrics
>>>> available
>>>>> for areas including jvm and flow component usage (via component
>> status)
>>>> as
>>>>> well as provenance data which NiFi makes available either through
>> the UI
>>>> or
>>>>> reporting tasks (for consumption by other systems). Past discussions
>> in
>>>> the
>>>>> community cite users shipping this data to applications such as
>>>> Prometheus,
>>>>> ELK stacks, or Ambari metrics for further analysis in order to
>>>>> capture/review performance issues, detect anomalies, and send alerts
>> or
>>>>> notifications.  These systems are efficient in capturing and helping
>> to
>>>>> analyze these metrics however it requires customization work and
>>>> knowledge
>>>>> of NiFi operations to provide meaningful analytics within a flow
>> context.
>>>>> 
>>>>> In speaking with Matt Burgess and Andy Christianson on this topic we
>> feel
>>>>> that there is an opportunity to introduce an analytics framework that
>>>> could
>>>>> provide users reasonable predictions on key performance indicators
>> for
>>>>> flows, such as back pressure and flow rate, to help administrators
>>>> improve
>>>>> operational management of NiFi clusters.  This framework could offer
>>>>> several key features:
>>>>> 
>>>>>  - Provide a flexible internal analytics engine and model api which
>>>>>  supports the addition of or enhancement to onboard models
>>>>>  - Support integration of remote or cloud based ML models
>>>>>  - Support both traditional and online (incremental) learning
>> methods
>>>>>  - Provide support for model caching  (perhaps later inclusion into
>> a
>>>>>  model repository or registry)
>>>>>  - UI enhancements to display prediction information either in
>> existing
>>>>>  summary data, new data visualizations, or directly within the
>>>> flow/canvas
>>>>>  (where applicable)
>>>>> 
>>>>> For an initial target we thought that back pressure prediction would
>> be a
>>>>> good starting point for this initiative, given that back pressure
>>>> detection
>>>>> is a key indicator of flow performance and many of the metrics
>> currently
>>>>> available would provide enough data points to create a reasonable
>>>>> performing model.  We have some ideas on how this could be achieved
>>>> however
>>>>> we wanted to discuss this more with the community to get thoughts
>> about
>>>>> tackling this work, especially if there are specific use cases or
>> other
>>>>> factors that should be considered.
>>>>> 
>>>>> Looking forward to everyone's thoughts and input.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> -yolanda
>>>>> 
>>>>> --
>>>>> yolanda.m.davis@gmail.com
>>>>> @YolandaMDavis
>>>> 
>>>> 
>> 
>> 
>> 
>> --
>> Regards
>> 
>> Craig Knell
>> Mobile: +61 402 128 615
>> Skype: craigknell
>> 

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Arpad Boda <ab...@cloudera.com.INVALID>.
Craig,

OPC ( https://issues.apache.org/jira/browse/MINIFICPP-819 ) and Modbus (
https://issues.apache.org/jira/browse/MINIFICPP-897 ) are on the way for
MiNiFi C++; hopefully both will be part of the next release (0.7.0).
It's gonna be legen... wait for it! :)

Regards,
Arpad

On Wed, Jul 31, 2019 at 2:30 AM Craig Knell <cr...@gmail.com> wrote:

> Hi Folks
>
> That's our use case now.  All our Models are run in python.
> Currently we send events to the ML via http, although this is not optimal
>
> Our use case is edge ML where we want a light weight wrapper for
> Python code base.
> Jython however does not work with the code base
> I'm think of changing the interface to some thing like REDIS for pub/sub
> Id also like this to be a push deployment via minifi
>
> Also support for sensors via protocols via Modbus and OPC would be great
>
> Craig
>
> On Wed, Jul 31, 2019 at 1:43 AM Joe Witt <jo...@gmail.com> wrote:
> >
> > Definitely something that I think would really help the community.  It
> > might make sense to frame/structure these APIs such that an internal
> option
> > could be available to reduce dependencies and get up and running but that
> > also just as easily a remote implementation where the engine lives and is
> > managed externally could also be supported.
> >
> > Thanks
> >
> >
> > On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto <al...@apache.org>
> wrote:
> >
> > > Yolanda,
> > >
> > > I think this sounds like a great idea and will be very useful to
> > > admins/users, as well as enabling some interesting next-level
> functionality
> > > and insight generation. Thanks for putting this out there.
> > >
> > > Andy LoPresto
> > > alopresto@apache.org
> > > alopresto.apache@gmail.com
> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> > >
> > > > On Jul 30, 2019, at 5:55 AM, Yolanda Davis <
> yolanda.m.davis@gmail.com>
> > > wrote:
> > > >
> > > > Hello Everyone,
> > > >
> > > > I wanted to reach out to the community to discuss potentially
> enhancing
> > > > NiFi to include predictive analytics that can help users assess and
> > > predict
> > > > NiFi behavior and performance. Currently NiFi has lots of metrics
> > > available
> > > > for areas including jvm and flow component usage (via component
> status)
> > > as
> > > > well as provenance data which NiFi makes available either through
> the UI
> > > or
> > > > reporting tasks (for consumption by other systems). Past discussions
> in
> > > the
> > > > community cite users shipping this data to applications such as
> > > Prometheus,
> > > > ELK stacks, or Ambari metrics for further analysis in order to
> > > > capture/review performance issues, detect anomalies, and send alerts
> or
> > > > notifications.  These systems are efficient in capturing and helping
> to
> > > > analyze these metrics however it requires customization work and
> > > knowledge
> > > > of NiFi operations to provide meaningful analytics within a flow
> context.
> > > >
> > > > In speaking with Matt Burgess and Andy Christianson on this topic we
> feel
> > > > that there is an opportunity to introduce an analytics framework that
> > > could
> > > > provide users reasonable predictions on key performance indicators
> for
> > > > flows, such as back pressure and flow rate, to help administrators
> > > improve
> > > > operational management of NiFi clusters.  This framework could offer
> > > > several key features:
> > > >
> > > >   - Provide a flexible internal analytics engine and model api which
> > > >   supports the addition of or enhancement to onboard models
> > > >   - Support integration of remote or cloud based ML models
> > > >   - Support both traditional and online (incremental) learning
> methods
> > > >   - Provide support for model caching  (perhaps later inclusion into
> a
> > > >   model repository or registry)
> > > >   - UI enhancements to display prediction information either in
> existing
> > > >   summary data, new data visualizations, or directly within the
> > > flow/canvas
> > > >   (where applicable)
> > > >
> > > > For an initial target we thought that back pressure prediction would
> be a
> > > > good starting point for this initiative, given that back pressure
> > > detection
> > > > is a key indicator of flow performance and many of the metrics
> currently
> > > > available would provide enough data points to create a reasonable
> > > > performing model.  We have some ideas on how this could be achieved
> > > however
> > > > we wanted to discuss this more with the community to get thoughts
> about
> > > > tackling this work, especially if there are specific use cases or
> other
> > > > factors that should be considered.
> > > >
> > > > Looking forward to everyone's thoughts and input.
> > > >
> > > > Thanks,
> > > >
> > > > -yolanda
> > > >
> > > > --
> > > > yolanda.m.davis@gmail.com
> > > > @YolandaMDavis
> > >
> > >
>
>
>
> --
> Regards
>
> Craig Knell
> Mobile: +61 402 128 615
> Skype: craigknell
>

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Craig Knell <cr...@gmail.com>.
Hi Folks

That's our use case now.  All our models run in Python.
Currently we send events to the ML models via HTTP, although this is not optimal.

Our use case is edge ML, where we want a lightweight wrapper for a
Python code base.
Jython, however, does not work with the code base.
I'm thinking of changing the interface to something like Redis for pub/sub.
I'd also like this to be a push deployment via MiNiFi.
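
To make the pub/sub idea concrete, here is a rough sketch of the pattern
only. It uses an in-process stand-in rather than Redis itself, and every
name in it (InMemoryBroker, the "sensor-events" channel, the doubling
"model") is hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Hypothetical in-process stand-in for a pub/sub broker such as Redis."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[dict], None]) -> None:
        """Register a handler that is called for every event on the channel."""
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, event: dict) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = InMemoryBroker()
scores = []

# The Python model subscribes once; the flow side only publishes events.
# Doubling the value is a placeholder for a real model call.
broker.subscribe("sensor-events", lambda event: scores.append(event["value"] * 2))
delivered = broker.publish("sensor-events", {"value": 21})
```

The point of the pattern is that the publisher never addresses the model
directly, so models could be added or swapped at the edge without changing
the flow that emits the events.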

Also, support for sensors via protocols such as Modbus and OPC would be great.

Craig

On Wed, Jul 31, 2019 at 1:43 AM Joe Witt <jo...@gmail.com> wrote:
>
> Definitely something that I think would really help the community.  It
> might make sense to frame/structure these APIs such that an internal option
> could be available to reduce dependencies and get up and running but that
> also just as easily a remote implementation where the engine lives and is
> managed externally could also be supported.
>
> Thanks



-- 
Regards

Craig Knell
Mobile: +61 402 128 615
Skype: craigknell

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Joe Witt <jo...@gmail.com>.
Definitely something that I think would really help the community.  It
might make sense to frame/structure these APIs such that an internal option
could be available to reduce dependencies and get up and running, but that
a remote implementation, where the engine lives and is managed externally,
could just as easily be supported.
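
As a minimal sketch of what such a split might look like (not a proposed
NiFi API; the class names, the request/reply dictionary shape, and the use
of a least-squares line as the onboard model are all assumptions made for
illustration), both options could sit behind one interface:

```python
from abc import ABC, abstractmethod
from typing import Callable, Sequence


class PredictionEngine(ABC):
    """Hypothetical engine API: estimate a future queue size from past samples."""

    @abstractmethod
    def predict(self, timestamps: Sequence[float], counts: Sequence[float], at: float) -> float:
        ...


class LocalLinearEngine(PredictionEngine):
    """Internal option: ordinary least-squares line, no external dependencies."""

    def predict(self, timestamps, counts, at):
        n = len(timestamps)
        if n < 2:
            raise ValueError("need at least two samples")
        mean_t = sum(timestamps) / n
        mean_c = sum(counts) / n
        var_t = sum((t - mean_t) ** 2 for t in timestamps)
        if var_t == 0:
            raise ValueError("timestamps must not all be equal")
        slope = sum((t - mean_t) * (c - mean_c) for t, c in zip(timestamps, counts)) / var_t
        # Extrapolate the fitted line to the requested time.
        return mean_c + slope * (at - mean_t)


class RemoteEngine(PredictionEngine):
    """Remote option: delegates to an externally managed engine.

    The transport is injected (for example an HTTP client wrapper), so this
    class makes no assumption about how the remote engine is reached.
    """

    def __init__(self, transport: Callable[[dict], dict]):
        self._transport = transport

    def predict(self, timestamps, counts, at):
        reply = self._transport(
            {"timestamps": list(timestamps), "counts": list(counts), "at": at}
        )
        return float(reply["prediction"])
```

A caller would hold only a PredictionEngine reference, e.g.
LocalLinearEngine().predict([0, 60, 120], [100, 200, 300], 180), and could
switch to a RemoteEngine without changing the calling code.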

Thanks


On Tue, Jul 30, 2019 at 1:40 PM Andy LoPresto <al...@apache.org> wrote:

> Yolanda,
>
> I think this sounds like a great idea and will be very useful to
> admins/users, as well as enabling some interesting next-level functionality
> and insight generation. Thanks for putting this out there.
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: [DISCUSS] Predictive Analytics for NiFi Metrics

Posted by Andy LoPresto <al...@apache.org>.
Yolanda, 

I think this sounds like a great idea and will be very useful to admins/users, as well as enabling some interesting next-level functionality and insight generation. Thanks for putting this out there. 

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
