Posted to dev@nifi.apache.org by Rajneesh Shukla <ra...@gmail.com> on 2020/04/06 15:02:24 UTC

Apache NiFi tool evaluation related queries

Hello All,

I need the information below about the Apache NiFi tool for data integration and
ETL needs:

Development effort:

Are the development effort, time, and complexity higher in general?

Maintainability:

Is it less maintainable?

Error Handling:

Does it have only a single log file, or does every transform have its own log
and error port?
What kinds of errors can be handled?

Various teams needed:

Is a separate administration team needed, or will a Unix or NT admin suffice for
the required work, so that it does not need a dedicated administrator?

File Structure:

Is it only able to read records with a single type of delimiter?

Data Integration Capability:

ODI boasts a comparatively smaller range of data integration products and
capabilities, which include many related functions such as profiling and data
quality? Also, if it offers these capabilities, are they more
mainstream in nature?

Market Segments:

Does it serve medium to large-scale companies?

Debugging:

Does it offer easy debugging? For example, just place some watchers at the
required places and intermediate data will be saved in temporary files for easy
viewing. Or is debugging a complex process through a debugger?

Company Strategy:

Can you download a scaled-down free version of the software, and are plenty of
free documents available on the internet?

Go live rate:

Is there a high “Go Live” success rate? Any known issues during deployment?


Scalability:
Is there any issue with stability? If yes, what causes the issue and what is
the impact?
Which kind of scalability is supported: horizontal or vertical?

Performance:
Can it support high volumes of data movement, transformation, and
integration (ETL operations)?
How about parallelism: mapping-level parallelism, session-level
parallelism, and support for multiple parallel source and multiple target data
loads?

Heterogeneous system:
Does it integrate data from various heterogeneous systems, such as multiple
kinds of databases (SQL Server, Oracle, DB2, etc.) and files (XML, XLS, CSV,
text, etc.)?
Can targets be any type of DB, file, etc.?

Big Data support:
Can it be integrated and used for Big Data?

On cloud solution:
Is it available for both cloud and on-premises platforms?

Pricing:
Is it freeware/open source? Does it come in basic, standard, and
enterprise edition flavors? If yes, are all flavors free?

Repository:
Does it offer repositories? Are those repositories for metadata?
Must the host for the repositories be a relational database?

Push down mechanism:
Do we have pushdown optimization concepts, where it can generate SQL
statements from the workflow/mapping that can be executed directly on the
database?
Is it an ETL or ELT tool?

Job scheduling:
Does it come with a built-in scheduler?

Version controlling:

Does it offer version control?
If yes, is it tightly controlled or moderate?

Tool Bugs:
Any known tool bugs? Any issues due to those bugs?


Anything else you want to highlight?


Thanks,

Rajneesh

Re: Apache NiFi tool evaluation related queries

Posted by Joe Witt <jo...@apache.org>.
Redirecting to dev@nifi.apache.org

1. As I mentioned earlier, please subscribe so your notes don't need
moderation.
2. Please don't email people directly.  We're trying to be helpful, but through
the mailing list, not on a personal basis.
3. Please conduct basic research on your own and come back with specific
questions.

Built-in Scheduler: Yes, please read about it.  Cron scheduling sounds
important for you.
Heterogeneous sources/targets: Yes, please read about it.
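
For the weekly batch load asked about in this thread, a cron-driven schedule
could look like the sketch below. It assumes NiFi's CRON driven scheduling
strategy takes a Quartz-style expression with six fields (seconds, minutes,
hours, day of month, month, day of week). The `describe` helper and the
Monday 02:00 start time are hypothetical, chosen only for illustration; the
NiFi User Guide is the place to confirm the exact syntax.

```python
# Quartz-style cron fields, in order. This field layout is an assumption
# about the scheduler's expression format, not taken from the thread.
FIELDS = ["seconds", "minutes", "hours", "day_of_month", "month", "day_of_week"]


def describe(expr: str) -> dict:
    """Split a six-field cron expression into named fields (hypothetical helper)."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# Hypothetical weekly batch load: every Monday at 02:00:00.
# "?" leaves day-of-month unspecified because day-of-week is set.
weekly = describe("0 0 2 ? * MON")
print(weekly)
```

The expression string itself is what would go into a processor's scheduling
settings; the helper only makes the field order explicit.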

Thanks
Joe

On Mon, Apr 6, 2020 at 3:50 PM Rajneesh Shukla <ra...@gmail.com>
wrote:

> Thanks Joe, one more question please...
>
> Does it offer a built-in scheduler to schedule a batch load, such as once a
> week at a given start time?
> Does it offer data integration from heterogeneous sources, such as various
> types of DB (SQL Server, Oracle, SFDC, DB2, etc.) and various types of files
> like XML, text, Excel, JSON, XSD, CSV, etc.? Can the target be anything?
>
> Thanks,
> Rajneesh
>
> On Mon, 6 Apr 2020 at 23:59, Joe Witt <jo...@gmail.com> wrote:
>
>> Apache NiFi is a project of the Apache Software Foundation and its source
>> code is made available under the Apache License version 2.0
>> https://www.apache.org/licenses/LICENSE-2.0.
>>
>> Best wishes
>> Joe
>>
>> On Mon, Apr 6, 2020 at 2:21 PM Rajneesh Shukla <ra...@gmail.com>
>> wrote:
>>
>>> Thanks Joe,
>>>
>>> May I know if Apache NiFi is still available as an open source/freeware
>>> ETL tool?
>>>
>>> Thanks,
>>> Rajneesh
>>>
>>> On Mon, 6 Apr 2020 at 20:41, Joe Witt <jo...@gmail.com> wrote:
>>>
>>>> Added your email as bcc.  You need to subscribe to get your messages
>>>> through without moderation and to see responses emailed to you.
>>>>
>>>> Rajneesh
>>>>
>>>> These questions are beyond high-level and vague, and really not
>>>> appropriate for the community to even respond to meaningfully.  You'll need to
>>>> review the available documentation, and if there is something you have a
>>>> question on, please ask.  For the vendor/market-related questions you
>>>> should reach out to a vendor or review the publicly available material
>>>> online. This looks like a pretty stock RFP/contract/software acquisition
>>>> doc, so you should be able to get these answers on your own within minutes.
>>>>
>>>> Thanks

Re: Apache NiFi tool evaluation related queries

Posted by Joe Witt <jo...@gmail.com>.
Apache NiFi is a project of the Apache Software Foundation and its source
code is made available under the Apache License version 2.0
https://www.apache.org/licenses/LICENSE-2.0.

Best wishes
Joe

Re: Apache NiFi tool evaluation related queries

Posted by Rajneesh Shukla <ra...@gmail.com>.
Thanks Joe,

May I know if Apache NiFi is still available as an open source/freeware ETL
tool?

Thanks,
Rajneesh

Re: Apache NiFi tool evaluation related queries

Posted by Joe Witt <jo...@gmail.com>.
Added your email as bcc.  You need to subscribe to get your messages
through without moderation and to see responses emailed to you.

Rajneesh

These questions are beyond high-level and vague, and really not
appropriate for the community to even respond to meaningfully.  You'll need to
review the available documentation, and if there is something you have a
question on, please ask.  For the vendor/market-related questions you
should reach out to a vendor or review the publicly available material
online. This looks like a pretty stock RFP/contract/software acquisition
doc, so you should be able to get these answers on your own within minutes.

Thanks
