Posted to users@nifi.apache.org by Márcio Sugar <fa...@ymail.com> on 2019/07/27 04:27:00 UTC

Running NiFi on Google Cloud

Hi,
Is there a tutorial, guide, or set of best practices for installing and running NiFi on Google Cloud (or on any cloud provider, for that matter)?
Thank you,

Marcio

Re: Running NiFi on Google Cloud

Posted by Márcio Sugar <fa...@ymail.com>.
Hi Dano,

Thanks for your recommendation. I'll certainly keep that in mind.

From your answer, I infer at least some of your data processing uses NiFi as the choreographer. In my case, we use NiFi just to move data around, so it performs a more limited role.

To give you some context: My goal is to recreate our on-prem Data Warehouse in the cloud, preferably using managed services. 

We're currently in the early stages of our migration, still deciding how to make data from our systems of record accessible in Google Cloud. The sources include relational databases, file extracts, and REST APIs. I've decided to start with the batch-oriented work, but the ultimate goal is streaming data processing.

Currently, I'm running NiFi on-prem to copy JSON and CSV files to GCS, and also publish data retrieved from databases to Cloud Pub/Sub topics. Cloud Functions then trigger the execution of Dataflow pipelines (sometimes controlled by Airflow) in response, and the resulting validated, enriched data are stored in BigQuery. 
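
As an illustration of the "Cloud Functions then trigger Dataflow" step described above, here is a minimal Python sketch. It is not the poster's actual code. It assumes a GCS-triggered background function whose event dict carries `bucket` and `name`, and it only builds the arguments for a Dataflow template launch rather than calling any Google client. The function name `build_launch_request`, the `ingest-` job prefix, and the `inputFile` template parameter are all hypothetical; the dict keys are modelled on the Dataflow REST `templates.launch` method and should be checked against the current documentation before use.

```python
import re


def build_launch_request(event, template_gcs_path, project, region):
    """Map a GCS 'object finalized' event to Dataflow template launch arguments.

    Pure function: it performs no network calls, so it can be unit-tested
    without any Google Cloud libraries installed.
    """
    bucket = event["bucket"]
    name = event["name"]

    # Assumption: job names should be limited to lowercase letters, digits
    # and hyphens, so anything else in the object name becomes a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    job_name = f"ingest-{slug}"[:40].rstrip("-")

    return {
        "projectId": project,
        "location": region,
        "gcsPath": template_gcs_path,
        "body": {
            "jobName": job_name,
            # "inputFile" is a hypothetical template parameter name.
            "parameters": {"inputFile": f"gs://{bucket}/{name}"},
        },
    }


if __name__ == "__main__":
    sample_event = {"bucket": "landing", "name": "exports/Orders_2019-07-27.csv"}
    request = build_launch_request(
        sample_event, "gs://my-templates/ingest", "my-project", "us-central1"
    )
    print(request["body"]["jobName"])
    print(request["body"]["parameters"]["inputFile"])
```

Keeping the event-to-request mapping separate from the actual API call makes the function easy to test locally; the real Cloud Function would pass the resulting dict to whichever Dataflow client it uses.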

My on-prem NiFi flows usually start with ListFile, FetchFile, QueryDatabaseTable, or ConsumeKafka and end with a PutGCSObject or PublishGCPPubSub processor. (Previously, the flows did a lot more, from format conversions to custom data processing, but I'm now trying to leave most of the hard work to Dataflow.) I intend to keep NiFi in a similar role after I move the cluster over to the cloud.

Suggestions are always welcome. It is sometimes frustrating to try to acquire all the necessary knowledge on my own; much of it seems to be tribal.

Thanks again,

Marcio



On Sunday, July 28, 2019, 10:17:01 a.m. EDT, dan young <da...@gmail.com> wrote: 

Hello Márcio,

We've been running NiFi clusters for almost 3 years now at Looker on AWS. We will be moving these over to GCP in the future. My main recommendation is to ensure that you're using something like Ansible to help with the deployment and configuration of the cluster. We use a lot of ExecuteStreamCommand processors to run a variety of node workloads.

Other than that, a lot will be specific to your use case, and your mileage will vary.

Regards

Dano



Re: Running NiFi on Google Cloud

Posted by dan young <da...@gmail.com>.
Hello Márcio,

We've been running NiFi clusters for almost 3 years now at Looker on AWS.
We will be moving these over to GCP in the future. My main recommendation
is to ensure that you're using something like Ansible to help with the
deployment and configuration of the cluster. We use a lot of
ExecuteStreamCommand processors to run a variety of node workloads.

Other than that, a lot will be specific to your use case, and your mileage
will vary.
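
To make the deployment and configuration recommendation above a little more concrete, here is a minimal Python sketch of the kind of per-node templating a tool like Ansible would automate. It is not taken from the poster's setup. The property keys are standard NiFi 1.x `nifi.properties` cluster settings; the function name `render_cluster_properties`, the host names, and the default ports are assumptions chosen for illustration.

```python
def render_cluster_properties(node_host, zookeeper_hosts,
                              protocol_port=11443, web_port=8080):
    """Render the cluster-related lines of nifi.properties for one node.

    Only a handful of settings are shown; a real deployment would template
    the whole file (security, repositories, and so on).
    """
    # Assumption: every ZooKeeper host listens on the default client port 2181.
    zk_connect = ",".join(f"{host}:2181" for host in zookeeper_hosts)

    props = {
        "nifi.web.http.host": node_host,
        "nifi.web.http.port": str(web_port),
        "nifi.cluster.is.node": "true",
        "nifi.cluster.node.address": node_host,
        "nifi.cluster.node.protocol.port": str(protocol_port),
        "nifi.zookeeper.connect.string": zk_connect,
    }
    # Dicts preserve insertion order, so the output order is deterministic.
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical host names for a three-node cluster.
    zk = ["zk-1.internal", "zk-2.internal", "zk-3.internal"]
    for node in ["nifi-1.internal", "nifi-2.internal", "nifi-3.internal"]:
        print(f"# {node}")
        print(render_cluster_properties(node, zk))
```

Generating the node-specific values from one inventory, rather than editing each node by hand, is what keeps a cluster consistent when it is moved between providers.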

Regards

Dano

