Posted to hdfs-user@hadoop.apache.org by Sameer Jain <Sa...@evalueserve.com> on 2013/01/18 06:51:53 UTC

Query: Hadoop's threat to Informatica

Hi,

I am trying to understand the different data analysis algorithms available in the market. Analyst opinion suggests that Informatica and Hadoop have the best offerings in this space.

However, I am not very clear on how the two differ and how they compete, given that Hadoop is also being used by companies such as IBM. Since you appear to be fairly seasoned experts in this domain, I would hugely appreciate any thoughts or insights on the following:

•         The workings of Hadoop/MapReduce

•         Informatica’s product offering

•         A comparison of the two, and which one is better

•         A view on whether Hadoop can compete, or already competes, with Informatica.

Regards,
Sameer

Sameer Jain
________________________________
Research Lead
Evalueserve
Office: + 91 124 4621615
Mob: + 91 7827256066
Fax: + 91 124 406 3430
www.evalueserve.com<http://www.evalueserve.com/>



Re: Query: Hadoop's threat to Informatica

Posted by Jeff Bean <jw...@cloudera.com>.
Informatica's take on the question:

http://www.informatica.com/hadoop/

My take on the question:

Hadoop is definitely disruptive, and there have been times when Hadoop let
us comfortably beat data pipeline SLAs that were being missed, where tools
like Informatica could not. But Informatica's points about metadata
management, mixed workloads, and governance are fairly well taken. It's
not that this stuff isn't doable with Hadoop; it's that enterprise tools
like Informatica are a little further along in maturity.

Jeff

On Thu, Jan 17, 2013 at 10:51 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello Sameer,
>
>      Pl find my comments embedded below :
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Fri, Jan 18, 2013 at 11:21 AM, Sameer Jain <Sameer.Jain@evalueserve.com
> > wrote:
>
>>   Hi,
>>
>>
>>
>> I am trying to understand the different data analysis algorithms
>> available in the market. Analyst opinion suggests that Informatica and
>> Hadoop have the best offerings in this space.
>>
>>
>>
>> However, I am not very clear as to how the two are different and how they
>> compete, because Hadoop is being used by IBM etc. Since you appear to be a
>> fairly seasoned expert in this domain, I would like to get your perspective
>> on the following:
>>
>>
>>
>> I would hugely appreciate any thoughts/insights around
>>
>> ·         The workings of Hadoop/Mapreduce
>>
> >>Hadoop is an open source platform that allows
> us to store and process huge, really huge, amount
> of data over a network of machines(need not be
> very sophisticated). It has 2 layers viz : HDFS &
> MapReduce for storage & processing respectively.
>
>> ·         Informatica’s product offering
>>
> >>They can tell you better. This list is specific to
> Hadoop ecosystem.
>
>>  ·         A comparison of which one of these is better
>>
> >>Depends upon the particular use case. One size
> doesn't fit all.
>
>>  ·         A view of can and/or is Hadoop in competition with
>> Informatica.
>>
> >>I don't think so. Informatica is basically an ETL thing(if I
> am not wrong), while we leverage Hadoop's power to create
> ETL tools with the Help of different Hadoop sub projects.
> Though it is possible to use them together.
>

Re: Query: Hadoop's threat to Informatica

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Sameer,

     Please find my comments embedded below:

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Fri, Jan 18, 2013 at 11:21 AM, Sameer Jain
<Sa...@evalueserve.com> wrote:

>   Hi,
>
>
>
> I am trying to understand the different data analysis algorithms available
> in the market. Analyst opinion suggests that Informatica and Hadoop have
> the best offerings in this space.
>
>
>
> However, I am not very clear as to how the two are different and how they
> compete, because Hadoop is being used by IBM etc. Since you appear to be a
> fairly seasoned expert in this domain, I would like to get your perspective
> on the following:
>
>
>
> I would hugely appreciate any thoughts/insights around
>
> ·         The workings of Hadoop/Mapreduce
>
>> Hadoop is an open source platform that allows
us to store and process huge, really huge, amounts
of data over a network of machines (which need not be
very sophisticated). It has two layers: HDFS for
storage and MapReduce for processing.
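
To make the processing layer a little more concrete, here is a minimal sketch of the MapReduce model as a word count in plain Python. It is only an illustration of the idea and does not use Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example, and everything runs in one process, whereas Hadoop would spread the map and reduce work across machines and read the input from HDFS.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (hypothetical local driver)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

The point of the split is that each call to map_phase and each call to reduce_phase depends only on its own input, which is what allows a framework to run many of them in parallel on different machines.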

> ·         Informatica’s product offering
>
>> They can tell you better. This list is specific to
the Hadoop ecosystem.

>  ·         A comparison of which one of these is better
>
>> It depends upon the particular use case. One size
doesn't fit all.

>  ·         A view of can and/or is Hadoop in competition with Informatica.
>
>> I don't think so. Informatica is basically an ETL tool (if I
am not wrong), while with Hadoop we build ETL pipelines
with the help of different Hadoop subprojects.
It is possible to use the two together, though.

