Posted to dev@nifi.apache.org by Erik Anderson <ea...@pobox.com> on 2019/06/28 13:10:28 UTC

NiFi kubectl for launching container jobs

I have heard about NiFi-Fn (B23 Kubernetes Operator for NiFi-Fn).

Has anyone built a NiFi kubectl processor, and possibly a nice NiFi "remote jobs" base Docker container, that can be used to control a remote NiFi processor/job that conforms to Apache NiFi input and output mechanisms (flow file format)?

I know we would need a way to marshal the NiFi flowfile format in and out of a container, but if we did, we could launch remote Python processes that scale well using cloud-native mechanisms (DevOps).
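A minimal sketch of what that marshaling could look like on the Python side. The envelope here is a hypothetical one invented for illustration (a 4-byte length prefix, a JSON attribute header, then the raw content); it is not NiFi's actual FlowFile packaging format, and the function names are assumptions.

```python
import json
import struct


def pack_flowfile(attributes, content):
    """Serialize attributes (dict of str -> str) and content (bytes) into one blob.

    Hypothetical envelope, NOT NiFi's real packaging format:
    4-byte big-endian header length | JSON-encoded attributes | raw content bytes
    """
    header = json.dumps(attributes, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + content


def unpack_flowfile(blob):
    """Inverse of pack_flowfile: return (attributes, content)."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = blob[4:4 + header_len]
    attributes = json.loads(header.decode("utf-8"))
    content = blob[4 + header_len:]
    return attributes, content
```

A container entrypoint could read such a blob from stdin, process the content, and write a new blob to stdout, which keeps attributes and content together across the container boundary.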

We built a native Python 2.7/3.7 NiFi processor that allows you to quickly chain together Java and Python flows. This is powerful because most data infrastructure is in Python, not Java, especially for geospatial data. Of course this won't scale, because only so many Python processors can run on a single NiFi node, but it allows you to quickly get things working. In 2 days you can do some amazing things.

If I can now offload that Python processing via Kubernetes kubectl, we can use automated DevOps scaling for some really large jobs, possibly using a NiFi processor that wraps the Kubernetes Java client (https://github.com/kubernetes-client/java).
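As a rough sketch of the offload step, the job description that such a processor would submit could be built like this. The manifest uses the standard Kubernetes batch/v1 Job fields; the function name, job name, image, and command are all placeholders assumed for illustration.

```python
import json


def build_job_manifest(name, image, command, parallelism=1):
    """Build a Kubernetes batch/v1 Job manifest as a plain dict.

    name, image, and command are hypothetical placeholders; parallelism
    sets how many pods of the job may run at the same time.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "parallelism": parallelism,
            "backoffLimit": 2,  # retry failed pods at most twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": list(command)}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        "geo-job",                            # hypothetical job name
        "example.invalid/geo-python:latest",  # hypothetical image
        ["python", "process.py"],             # hypothetical entrypoint
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML. A processor wrapping a Kubernetes client library would submit an equivalent object through the API instead of shelling out.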

Why all this jazz?
Real use case: geospatial data (GeoJSON, ESRI Shapefile, etc.). Processing it requires standard Python "pip install blah-blah" packages.

Thoughts? Please throw tomatoes at the idea. I welcome constructive and destructive criticism because that means people care.

Erik Anderson
Bloomberg


Re: NiFi kubectl for launching container jobs

Posted by Joe Witt <jo...@gmail.com>.
Erik

The pattern/concept described is definitely a thing and a powerful model.
The stateless-nifi construct is a key enabler of this, combined with
seamless integration of traditional NiFi with it, the registry, and a
powerful Kubernetes operator.

Thanks
