Posted to user@hadoop.apache.org by 23...@gmail.com on 2015/09/07 21:59:26 UTC

ETL/DW to Hadoop migrations

Hi guys, 



I am looking for pointers on migrating an existing data warehouse to Hadoop. Currently, we use IBM DataStage as our ETL tool and load into Teradata staging/maintain tables. Please suggest an architecture that reduces cost without much degradation in performance. Have any of you been part of such a migration before? If so, please share your input, especially on which aspects we should take care of. The source data is mainly in the form of flat files and databases.


Thanks in advance. 


Regards, 


Abhishek Singh

Re: ETL/DW to Hadoop migrations

Posted by Abhishek Singh <23...@gmail.com>.
Thank you Partho, Nagaraj, Sandesh and Kishore. Thanks for your insights.

On Wed, Sep 9, 2015 at 7:25 AM, Nagaraj Chandrashekar <
nchandrashekar@innominds.com> wrote:

> Hello Abhishek,
>
> I think you may find this white paper useful. It discusses offloading
> Teradata with Hadoop, and also covers capacity and cost savings with
> Hadoop solutions.
>
>
> http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
>
>
> *Cheers*
> *Nagaraj C*
>
> *Learn And Share! It’s Big Data.*
>
>
> From: Sandesh Hegde <sa...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Wednesday, September 9, 2015 at 1:24 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: ETL/DW to Hadoop migrations
>
> Hello Abhishek,
>
> Below is a link to a free data ingestion tool, dtIngest, which runs on
> Hadoop as a YARN app and supports various data sources.
> Currently it does not support databases; future versions may.
> For databases you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for replying. We are planning to do a POC aimed at replacing
>> DataStage. DataStage and Teradata are costly tools and are burning a big
>> hole in our pocket. So, have you come across any case where an ETL
>> pipeline was replaced with Hadoop? I understand the connectors you
>> mention, but how about replacing the ETL tool itself?
>>
>> Any links would be more than welcome.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>>    Are you looking to load your data into Hadoop? If yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating an existing data warehouse to
>>>> Hadoop. Currently, we use IBM DataStage as our ETL tool and load
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> that reduces cost without much degradation in performance. Have any of you
>>>> been part of such a migration before? If so, please share your
>>>> input, especially on which aspects we should take care of. The source
>>>> data is mainly in the form of flat files and databases.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
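
For the database sources, the Apache Sqoop suggestion quoted above could look roughly like the sketch below. It only assembles a `sqoop import` command line in Python and does not run anything. The host name, database, table, paths and user are hypothetical placeholders, and the Teradata JDBC URL and driver class are assumptions about the setup (the Teradata JDBC driver jar would have to be available to Sqoop); the Sqoop options themselves (`--connect`, `--driver`, `--table`, `--target-dir`, `--split-by`, `--num-mappers`, `--password-file`) are standard import arguments.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username,
                       password_file, driver=None, split_by=None,
                       num_mappers=4):
    """Return the argument list for a `sqoop import` of one table into HDFS."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,
        "--target-dir", target_dir,        # HDFS directory that receives the rows
        "--num-mappers", str(num_mappers), # number of parallel import tasks
    ]
    if driver:
        # Explicit JDBC driver class, for databases using the generic JDBC path.
        args += ["--driver", driver]
    if split_by:
        # Column Sqoop uses to divide the table across the parallel mappers.
        args += ["--split-by", split_by]
    return args


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not taken from the thread.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:teradata://td-host.example.com/DATABASE=staging",
        table="CUSTOMER_STG",
        target_dir="/data/raw/customer_stg",
        username="etl_user",
        password_file="/user/etl_user/.td_password",
        driver="com.teradata.jdbc.TeraDriver",
        split_by="CUSTOMER_ID",
    )
    print(" ".join(shlex.quote(a) for a in cmd))
```

Building the command as a list keeps each table's import reproducible and easy to schedule from whatever replaces the DataStage job sequence; the number of mappers is the main knob that trades load on the source database against import speed.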

Re: ETL/DW to Hadoop migrations

Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,

I think you may find this white paper useful. It discusses offloading Teradata with Hadoop, and also covers capacity and cost savings with Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf

Cheers
Nagaraj C

Learn And Share! It's Big Data.






Re: ETL/DW to Hadoop migrations

Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,

Indeed, these tools are expensive, and we see many customers opting for
open source projects to reduce license costs, avoid vendor lock-in, and
benefit from a steady group of software engineers making the projects more
industry-friendly.

If you are looking at offloading from DataStage and Teradata, I can
suggest these open source projects:

http://projects.spring.io/spring-xd/

Spring XD allows you to ingest data from a traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.

Another project, which lets you run SQL directly against flat files, is
Pivotal HAWQ. Pivotal HAWQ is a SQL-on-Hadoop analytic engine. You can find
more about it at

http://pivotal.io/big-data/pivotal-hawq

If you would like to try out some of our enterprise-grade as well as beta
offerings, please look here: https://network.pivotal.io

Thanks

Partho Bardhan
Data Engineering
Pivotal
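
For the flat-file side of the pipeline discussed in this thread, the transformation step of an ETL job can be sketched as a small script that reads records on stdin and writes cleaned records to stdout, which is the shape Hadoop Streaming expects from a mapper. This is only an illustration under stated assumptions: the pipe-delimited three-column layout (customer id, name, amount) and the cleaning rules are hypothetical, not taken from the thread, and it is not how Spring XD or HAWQ work internally.

```python
import csv
import sys


def clean_record(fields, expected_columns=3):
    """Return a cleaned row, or None if the record should be rejected."""
    if len(fields) != expected_columns:
        return None
    customer_id, name, amount = (f.strip() for f in fields)
    if not customer_id:
        return None  # the key column must be present
    try:
        amount_value = float(amount)
    except ValueError:
        return None  # non-numeric amount
    return [customer_id, name.upper(), f"{amount_value:.2f}"]


def run(lines, out):
    """Read pipe-delimited lines, write tab-separated cleaned rows.

    Returns a (kept, rejected) pair of record counts.
    """
    kept = rejected = 0
    for fields in csv.reader(lines, delimiter="|"):
        row = clean_record(fields)
        if row is None:
            rejected += 1
            continue
        out.write("\t".join(row) + "\n")
        kept += 1
    return kept, rejected


if __name__ == "__main__":
    # Used as a streaming mapper: records arrive on stdin, results go to stdout.
    run(sys.stdin, sys.stdout)
```

Keeping the cleaning rules in a plain function makes them easy to test locally on a sample file before running them across the cluster, which helps when comparing results against the existing ETL jobs during a POC.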




Re: ETL/DW to Hadoop migrations

Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,

Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.

If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects

http://projects.spring.io/spring-xd/

Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.

Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on

http://pivotal.io/big-data/pivotal-hawq

If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io

Thanks

Partho Bardhan
Data Engineering
Pivotal



On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:

> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>>    Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables.  Please suggest an architecture
>>>> which reduces cost without much degrade in performance.  Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs,  especially on what aspects should we be taking care of.  Talking
>>>> about source data,  it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>

Re: ETL/DW to Hadoop migrations

Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,

Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.

If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects

http://projects.spring.io/spring-xd/

Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.

Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on

http://pivotal.io/big-data/pivotal-hawq

If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io

Thanks

Partho Bardhan
Data Engineering
Pivotal



On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:

> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>>    Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables.  Please suggest an architecture
>>>> which reduces cost without much degrade in performance.  Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs,  especially on what aspects should we be taking care of.  Talking
>>>> about source data,  it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>

Re: ETL/DW to Hadoop migrations

Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,

I think you may find this white paper useful.  This document talks about offloading Teradata with Hadoop. It also talks about capacity and savings costs using Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf

Cheers
Nagaraj C

Learn And Share! It's Big Data.


From: Sandesh Hegde <sa...@gmail.com>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Date: Wednesday, September 9, 2015 at 1:24 AM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: Re: ETL/DW to Hadoop migrations

Hello Abhishek,

Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop as Yarn app. Support various data sources.
Currently it doesn't have a support for Databases, future versions may have it. For database you can try Apache Sqoop.

https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/

Thanks
Sandesh

PS: I work for DataTorrent.

On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>> wrote:
Hi Kishore,

Thanks for reverting. We are planning to do a POC in such a manner that we can replace Datastage. Datastage and Teradata are costly tools which is making a big hole in pocket. So, have you come across anything where ETL pipeline could be replaced with Hadoop? I understand about connectors which you are saying, but how about replacing an ETL tool?

Any links would do more than good.

Thanks once again.

Abhishek

On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <wr...@gmail.com>> wrote:
Abhishek,

   Are you looking for loading your data into Hadoop? if yes, IBM DataStage has a stage called BDFS that loads/writes your data into Hadoop.

Thanks,
Kishore

On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com>> wrote:

Hi guys,

I am looking for pointers on migrating existing data warehouse to Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading into Teradata staging/maintain tables.  Please suggest an architecture which reduces cost without much degrade in performance.  Has anyone of you been a part of such migration before? If yes then please provide some inputs,  especially on what aspects should we be taking care of.  Talking about source data,  it is mainly in the form of flat files and database.

Thanks in advance.

Regards,

Abhishek Singh




Re: ETL/DW to Hadoop migrations

Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,

Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.

If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects

http://projects.spring.io/spring-xd/

Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.

Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on

http://pivotal.io/big-data/pivotal-hawq

If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io

Thanks

Partho Bardhan
Data Engineering
Pivotal



On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:

> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>>    Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables.  Please suggest an architecture
>>>> which reduces cost without much degrade in performance.  Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs,  especially on what aspects should we be taking care of.  Talking
>>>> about source data,  it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>

Re: ETL/DW to Hadoop migrations

Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,

I think you may find this white paper useful.  This document talks about offloading Teradata with Hadoop. It also talks about capacity and cost savings using Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf

Cheers
Nagaraj C

Learn And Share! It's Big Data.


From: Sandesh Hegde <sa...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, September 9, 2015 at 1:24 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: ETL/DW to Hadoop migrations

Hello Abhishek,

Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop as Yarn app. Support various data sources.
Currently it doesn't have a support for Databases, future versions may have it. For database you can try Apache Sqoop.

https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/

Thanks
Sandesh

PS: I work for DataTorrent.

On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com> wrote:
Hi Kishore,

Thanks for reverting. We are planning to do a POC in such a manner that we can replace Datastage. Datastage and Teradata are costly tools which is making a big hole in pocket. So, have you come across anything where ETL pipeline could be replaced with Hadoop? I understand about connectors which you are saying, but how about replacing an ETL tool?

Any links would do more than good.

Thanks once again.

Abhishek

On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
Abhishek,

   Are you looking for loading your data into Hadoop? if yes, IBM DataStage has a stage called BDFS that loads/writes your data into Hadoop.

Thanks,
Kishore

On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:

Hi guys,

I am looking for pointers on migrating existing data warehouse to Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading into Teradata staging/maintain tables.  Please suggest an architecture which reduces cost without much degrade in performance.  Has anyone of you been a part of such migration before? If yes then please provide some inputs,  especially on what aspects should we be taking care of.  Talking about source data,  it is mainly in the form of flat files and database.

Thanks in advance.

Regards,

Abhishek Singh




Re: ETL/DW to Hadoop migrations

Posted by Sandesh Hegde <sa...@gmail.com>.
Hello Abhishek,

Below is a link to a free data ingestion tool, dtIngest, which runs on Hadoop
as a YARN app and supports various data sources.
Currently it doesn't support databases; future versions may. For databases,
you can try Apache Sqoop.

https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/

Thanks
Sandesh

PS: I work for DataTorrent.

On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
wrote:

> Hi Kishore,
>
> Thanks for reverting. We are planning to do a POC in such a manner that we
> can replace Datastage. Datastage and Teradata are costly tools which is
> making a big hole in pocket. So, have you come across anything where ETL
> pipeline could be replaced with Hadoop? I understand about connectors which
> you are saying, but how about replacing an ETL tool?
>
> Any links would do more than good.
>
> Thanks once again.
>
> Abhishek
>
> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Abhishek,
>>
>>    Are you looking for loading your data into Hadoop? if yes, IBM
>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>
>> Thanks,
>> Kishore
>>
>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I am looking for pointers on migrating existing data warehouse to
>>> Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading
>>> into Teradata staging/maintain tables.  Please suggest an architecture
>>> which reduces cost without much degrade in performance.  Has anyone of you
>>> been a part of such migration before? If yes then please provide some
>>> inputs,  especially on what aspects should we be taking care of.  Talking
>>> about source data,  it is mainly in the form of flat files and database.
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>>
>>> Abhishek Singh
>>>
>>
>>
>

Re: ETL/DW to Hadoop migrations

Posted by Abhishek Singh <23...@gmail.com>.
Hi Kishore,

Thanks for replying. We are planning to do a POC with the aim of replacing
DataStage. DataStage and Teradata are costly tools that are burning a big
hole in our pocket. So, have you come across any case where an ETL pipeline
was replaced with Hadoop? I understand the connectors you are referring to,
but what about replacing the ETL tool itself?

Any links would be very helpful.

Thanks once again.

Abhishek

On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Abhishek,
>
>    Are you looking for loading your data into Hadoop? if yes, IBM
> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>
> Thanks,
> Kishore
>
> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>
>> Hi guys,
>>
>> I am looking for pointers on migrating existing data warehouse to
>> Hadoop.  Currently,  we are using IBM Data stage an ETL tool and loading
>> into Teradata staging/maintain tables.  Please suggest an architecture
>> which reduces cost without much degrade in performance.  Has anyone of you
>> been a part of such migration before? If yes then please provide some
>> inputs,  especially on what aspects should we be taking care of.  Talking
>> about source data,  it is mainly in the form of flat files and database.
>>
>> Thanks in advance.
>>
>> Regards,
>>
>> Abhishek Singh
>>
>
>

Re: ETL/DW to Hadoop migrations

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Abhishek,

   Are you looking to load your data into Hadoop? If yes, IBM DataStage
has a stage called BDFS that loads/writes your data into Hadoop.

Thanks,
Kishore

On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:

> Hi guys,
>
> I am looking for pointers on migrating existing data warehouse to Hadoop.
> Currently,  we are using IBM Data stage an ETL tool and loading into
> Teradata staging/maintain tables.  Please suggest an architecture which
> reduces cost without much degrade in performance.  Has anyone of you been a
> part of such migration before? If yes then please provide some inputs,
> especially on what aspects should we be taking care of.  Talking about
> source data,  it is mainly in the form of flat files and database.
>
> Thanks in advance.
>
> Regards,
>
> Abhishek Singh
>
