Posted to user@hadoop.apache.org by 23...@gmail.com on 2015/09/07 21:59:26 UTC
ETL/DW to Hadoop migrations
Hi guys,
I am looking for pointers on migrating an existing data warehouse to Hadoop. Currently, we use IBM DataStage as our ETL tool and load into Teradata staging/maintain tables. Please suggest an architecture that reduces cost without much degradation in performance. Have any of you been part of such a migration before? If so, please share some input, especially on which aspects we should take care of. The source data is mainly in the form of flat files and databases.
Thanks in advance.
Regards,
Abhishek Singh
Re: ETL/DW to Hadoop migrations
Posted by Abhishek Singh <23...@gmail.com>.
Thank you Partho, Nagaraj, Sandesh and Kishore. Thanks for your insights.
On Wed, Sep 9, 2015 at 7:25 AM, Nagaraj Chandrashekar <
nchandrashekar@innominds.com> wrote:
> Hello Abhishek,
>
> I think you may find this white paper useful. This document talks about
> offloading Teradata with Hadoop. It also talks about capacity and cost
> savings using Hadoop solutions.
>
>
> http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
>
>
> *Cheers*
> *Nagaraj C*
>
> *Learn And Share! It’s Big Data.*
>
>
> From: Sandesh Hegde <sa...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Wednesday, September 9, 2015 at 1:24 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: ETL/DW to Hadoop migrations
>
> Hello Abhishek,
>
> Below is a link to a free data ingestion tool, dtIngest, which runs on Hadoop
> as a YARN app and supports various data sources.
> Currently it doesn't support databases; future versions may add that.
> For databases you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for getting back. We are planning to do a POC in such a manner that
>> we can replace DataStage. DataStage and Teradata are costly tools, which is
>> burning a big hole in our pocket. So, have you come across anything where an
>> ETL pipeline was replaced with Hadoop? I understand the connectors you are
>> describing, but how about replacing the ETL tool itself?
>>
>> Any links would be more than welcome.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>> Are you looking to load your data into Hadoop? If yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating an existing data warehouse to
>>>> Hadoop. Currently, we use IBM DataStage as our ETL tool and load
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> that reduces cost without much degradation in performance. Have any of
>>>> you been part of such a migration before? If so, please share some
>>>> input, especially on which aspects we should take care of. The source
>>>> data is mainly in the form of flat files and databases.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,
I think you may find this white paper useful. This document talks about offloading Teradata with Hadoop. It also talks about capacity and cost savings using Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
Cheers
Nagaraj C
Learn And Share! It's Big Data.
From: Sandesh Hegde <sa...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, September 9, 2015 at 1:24 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: ETL/DW to Hadoop migrations
Hello Abhishek,
Below is a link to a free data ingestion tool, dtIngest, which runs on Hadoop as a YARN app and supports various data sources.
Currently it doesn't support databases; future versions may add that. For databases you can try Apache Sqoop.
https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
Thanks
Sandesh
PS: I work for DataTorrent.
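For the database sources, the Sqoop route mentioned above could look roughly like the sketch below. It is only an illustration under assumptions, not something posted in this thread: the helper name build_sqoop_import, the host td-host.example.com, and the database, table and HDFS paths are all hypothetical, and a Teradata JDBC driver (or connector) is assumed to be available to Sqoop. The Python helper only assembles the sqoop import command line; it does not run it.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username,
                       password_file, num_mappers=4):
    """Return the argv list for a `sqoop import` of one table into HDFS.

    All argument values are supplied by the caller; nothing here is
    specific to the setup discussed in the thread.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # source table to copy
        "--target-dir", target_dir,         # HDFS directory for the output
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:teradata://td-host.example.com/DATABASE=staging",
        table="CUSTOMER_STG",
        target_dir="/data/raw/customer_stg",
        username="etl_user",
        password_file="/user/etl_user/.td_password",
    )
    # Print a shell-safe command line that could be pasted on an edge node.
    print(shlex.join(cmd))
```

The printed command would copy one table into the given HDFS directory using parallel map tasks. A table without a primary key would additionally need a --split-by column, and --num-mappers should be tuned to what the source database can tolerate.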
On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com> wrote:
Hi Kishore,
Thanks for getting back. We are planning to do a POC in such a manner that we can replace DataStage. DataStage and Teradata are costly tools, which is burning a big hole in our pocket. So, have you come across anything where an ETL pipeline was replaced with Hadoop? I understand the connectors you are describing, but how about replacing the ETL tool itself?
Any links would be more than welcome.
Thanks once again.
Abhishek
On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
Abhishek,
Are you looking to load your data into Hadoop? If yes, IBM DataStage has a stage called BDFS that loads/writes your data into Hadoop.
Thanks,
Kishore
On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
Hi guys,
I am looking for pointers on migrating an existing data warehouse to Hadoop. Currently, we use IBM DataStage as our ETL tool and load into Teradata staging/maintain tables. Please suggest an architecture that reduces cost without much degradation in performance. Have any of you been part of such a migration before? If so, please share some input, especially on which aspects we should take care of. The source data is mainly in the form of flat files and databases.
Thanks in advance.
Regards,
Abhishek Singh
Re: ETL/DW to Hadoop migrations
Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,
Indeed, these tools are expensive, and we see many customers opting for
open source projects to reduce license costs, prevent vendor lock-in and
benefit from a steady group of software engineers making the projects more
industry friendly.
If you are looking at offloading from DataStage and Teradata, I could
suggest these open source projects:
http://projects.spring.io/spring-xd/
Spring XD allows you to ingest data from a traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.
Another project, which allows you to run SQL directly on flat files, is
Pivotal HAWQ. Pivotal HAWQ is a SQL-on-Hadoop analytic engine. You can find
more about it at
http://pivotal.io/big-data/pivotal-hawq
If you would like to try out some of our enterprise-grade as well as beta
offerings, please look here: https://network.pivotal.io
Thanks
Partho Bardhan
Data Engineering
Pivotal
On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:
> Hello Abhishek,
>
> Below is a link to a free data ingestion tool, dtIngest, which runs on Hadoop
> as a YARN app and supports various data sources.
> Currently it doesn't support databases; future versions may add that.
> For databases you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for getting back. We are planning to do a POC in such a manner that
>> we can replace DataStage. DataStage and Teradata are costly tools, which is
>> burning a big hole in our pocket. So, have you come across anything where an
>> ETL pipeline was replaced with Hadoop? I understand the connectors you are
>> describing, but how about replacing the ETL tool itself?
>>
>> Any links would be more than welcome.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>> Are you looking to load your data into Hadoop? If yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating an existing data warehouse to
>>>> Hadoop. Currently, we use IBM DataStage as our ETL tool and load
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> that reduces cost without much degradation in performance. Have any of
>>>> you been part of such a migration before? If so, please share some
>>>> input, especially on which aspects we should take care of. The source
>>>> data is mainly in the form of flat files and databases.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,
Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.
If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects
http://projects.spring.io/spring-xd/
Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.
Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on
http://pivotal.io/big-data/pivotal-hawq
If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io
Thanks
Partho Bardhan
Data Engineering
Pivotal
On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:
> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>> Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop. Currently, we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> which reduces cost without much degrade in performance. Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs, especially on what aspects should we be taking care of. Talking
>>>> about source data, it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,
Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.
If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects
http://projects.spring.io/spring-xd/
Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.
Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on
http://pivotal.io/big-data/pivotal-hawq
If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io
Thanks
Partho Bardhan
Data Engineering
Pivotal
On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:
> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>> Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop. Currently, we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> which reduces cost without much degrade in performance. Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs, especially on what aspects should we be taking care of. Talking
>>>> about source data, it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,
I think you may find this white paper useful. This document talks about offloading Teradata with Hadoop. It also talks about capacity and savings costs using Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
Cheers
Nagaraj C
Learn And Share! It's Big Data.
From: Sandesh Hegde <sa...@gmail.com>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Date: Wednesday, September 9, 2015 at 1:24 AM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: Re: ETL/DW to Hadoop migrations
Hello Abhishek,
Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop as Yarn app. Support various data sources.
Currently it doesn't have a support for Databases, future versions may have it. For database you can try Apache Sqoop.
https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
Thanks
Sandesh
PS: I work for DataTorrent.
On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>> wrote:
Hi Kishore,
Thanks for reverting. We are planning to do a POC in such a manner that we can replace Datastage. Datastage and Teradata are costly tools which is making a big hole in pocket. So, have you come across anything where ETL pipeline could be replaced with Hadoop? I understand about connectors which you are saying, but how about replacing an ETL tool?
Any links would do more than good.
Thanks once again.
Abhishek
On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <wr...@gmail.com>> wrote:
Abhishek,
Are you looking for loading your data into Hadoop? if yes, IBM DataStage has a stage called BDFS that loads/writes your data into Hadoop.
Thanks,
Kishore
On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com>> wrote:
Hi guys,
I am looking for pointers on migrating existing data warehouse to Hadoop. Currently, we are using IBM Data stage an ETL tool and loading into Teradata staging/maintain tables. Please suggest an architecture which reduces cost without much degrade in performance. Has anyone of you been a part of such migration before? If yes then please provide some inputs, especially on what aspects should we be taking care of. Talking about source data, it is mainly in the form of flat files and database.
Thanks in advance.
Regards,
Abhishek Singh
Re: ETL/DW to Hadoop migrations
Posted by Partho Bardhan <pa...@gmail.com>.
Hi Abhishek,
Indeed these tools are expensive and we see many customers opting for a lot
of open source projects to reduce license costs, prevent vendor lock-in and
have a steady group of software engineers making the projects more industry
friendly.
If you are looking at offloading from Datastage and Teradata, I could
suggest these open source projects
http://projects.spring.io/spring-xd/
Spring XD allows you to ingest data from traditional EDW, streams and flat
files. It is designed for parallel, high-speed ingest.
Another project allow you to perform SQL directly from flat files is
Pivotal Hawq. Pivotal Hawq is a SQL on Hadoop analytic engine. You can find
more about it on
http://pivotal.io/big-data/pivotal-hawq
If you would like to try out some of our Enterprise Grade as well as Beta
offerings, please look here: https://network.pivotal.io
Thanks
Partho Bardhan
Data Engineering
Pivotal
On Tue, Sep 8, 2015 at 2:54 PM, Sandesh Hegde <sa...@gmail.com>
wrote:
> Hello Abhishek,
>
> Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop
> as Yarn app. Support various data sources.
> Currently it doesn't have a support for Databases, future versions may
> have it. For database you can try Apache Sqoop.
>
> https://www.datatorrent.com/product/datatorrent-dtingest/
>
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
>
> Thanks
> Sandesh
>
> PS: I work for DataTorrent.
>
> On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
> wrote:
>
>> Hi Kishore,
>>
>> Thanks for reverting. We are planning to do a POC in such a manner that
>> we can replace Datastage. Datastage and Teradata are costly tools which is
>> making a big hole in pocket. So, have you come across anything where ETL
>> pipeline could be replaced with Hadoop? I understand about connectors which
>> you are saying, but how about replacing an ETL tool?
>>
>> Any links would do more than good.
>>
>> Thanks once again.
>>
>> Abhishek
>>
>> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Abhishek,
>>>
>>> Are you looking for loading your data into Hadoop? if yes, IBM
>>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>>
>>> Thanks,
>>> Kishore
>>>
>>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am looking for pointers on migrating existing data warehouse to
>>>> Hadoop. Currently, we are using IBM Data stage an ETL tool and loading
>>>> into Teradata staging/maintain tables. Please suggest an architecture
>>>> which reduces cost without much degrade in performance. Has anyone of you
>>>> been a part of such migration before? If yes then please provide some
>>>> inputs, especially on what aspects should we be taking care of. Talking
>>>> about source data, it is mainly in the form of flat files and database.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>>
>>>> Abhishek Singh
>>>>
>>>
>>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Nagaraj Chandrashekar <nc...@innominds.com>.
Hello Abhishek,
I think you may find this white paper useful. It discusses offloading Teradata to Hadoop, and also covers capacity and cost savings with Hadoop solutions.
http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
Cheers
Nagaraj C
Learn And Share! It's Big Data.
From: Sandesh Hegde <sa...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, September 9, 2015 at 1:24 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: ETL/DW to Hadoop migrations
Hello Abhishek,
Below is a link to Free data ingestion tool, dtIngest, this runs on Hadoop as Yarn app. Support various data sources.
Currently it doesn't have a support for Databases, future versions may have it. For database you can try Apache Sqoop.
https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
Thanks
Sandesh
PS: I work for DataTorrent.
On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com> wrote:
Hi Kishore,
Thanks for reverting. We are planning to do a POC in such a manner that we can replace Datastage. Datastage and Teradata are costly tools which is making a big hole in pocket. So, have you come across anything where ETL pipeline could be replaced with Hadoop? I understand about connectors which you are saying, but how about replacing an ETL tool?
Any links would do more than good.
Thanks once again.
Abhishek
On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
Abhishek,
Are you looking for loading your data into Hadoop? if yes, IBM DataStage has a stage called BDFS that loads/writes your data into Hadoop.
Thanks,
Kishore
On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
Hi guys,
I am looking for pointers on migrating existing data warehouse to Hadoop. Currently, we are using IBM Data stage an ETL tool and loading into Teradata staging/maintain tables. Please suggest an architecture which reduces cost without much degrade in performance. Has anyone of you been a part of such migration before? If yes then please provide some inputs, especially on what aspects should we be taking care of. Talking about source data, it is mainly in the form of flat files and database.
Thanks in advance.
Regards,
Abhishek Singh
Re: ETL/DW to Hadoop migrations
Posted by Sandesh Hegde <sa...@gmail.com>.
Hello Abhishek,
Below is a link to a free data ingestion tool, dtIngest, which runs on Hadoop
as a YARN app and supports various data sources.
It currently has no support for databases; future versions may add
it. For databases, you can try Apache Sqoop.
https://www.datatorrent.com/product/datatorrent-dtingest/
https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
Thanks
Sandesh
PS: I work for DataTorrent.
On Tue, Sep 8, 2015 at 9:57 AM, Abhishek Singh <23...@gmail.com>
wrote:
> Hi Kishore,
>
> Thanks for reverting. We are planning to do a POC in such a manner that we
> can replace Datastage. Datastage and Teradata are costly tools which is
> making a big hole in pocket. So, have you come across anything where ETL
> pipeline could be replaced with Hadoop? I understand about connectors which
> you are saying, but how about replacing an ETL tool?
>
> Any links would do more than good.
>
> Thanks once again.
>
> Abhishek
>
> On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Abhishek,
>>
>> Are you looking for loading your data into Hadoop? if yes, IBM
>> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>>
>> Thanks,
>> Kishore
>>
>> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I am looking for pointers on migrating existing data warehouse to
>>> Hadoop. Currently, we are using IBM Data stage an ETL tool and loading
>>> into Teradata staging/maintain tables. Please suggest an architecture
>>> which reduces cost without much degrade in performance. Has anyone of you
>>> been a part of such migration before? If yes then please provide some
>>> inputs, especially on what aspects should we be taking care of. Talking
>>> about source data, it is mainly in the form of flat files and database.
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>>
>>> Abhishek Singh
>>>
>>
>>
>
Re: ETL/DW to Hadoop migrations
Posted by Abhishek Singh <23...@gmail.com>.
Hi Kishore,
Thanks for replying. We are planning a POC aimed at replacing
DataStage. DataStage and Teradata are costly tools that are
burning a big hole in our pocket. So, have you come across any case where an ETL
pipeline was replaced with Hadoop? I understand the connectors
you mention, but what about replacing the ETL tool itself?
Any links would be very helpful.
Thanks once again.
Abhishek
On Tue, Sep 8, 2015 at 9:28 AM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:
> Abhishek,
>
> Are you looking for loading your data into Hadoop? if yes, IBM
> DataStage has a stage called BDFS that loads/writes your data into Hadoop.
>
> Thanks,
> Kishore
>
> On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
>
>> Hi guys,
>>
>> I am looking for pointers on migrating existing data warehouse to
>> Hadoop. Currently, we are using IBM Data stage an ETL tool and loading
>> into Teradata staging/maintain tables. Please suggest an architecture
>> which reduces cost without much degrade in performance. Has anyone of you
>> been a part of such migration before? If yes then please provide some
>> inputs, especially on what aspects should we be taking care of. Talking
>> about source data, it is mainly in the form of flat files and database.
>>
>> Thanks in advance.
>>
>> Regards,
>>
>> Abhishek Singh
>>
>
>
Re: ETL/DW to Hadoop migrations
Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Abhishek,
Are you looking to load your data into Hadoop? If so, IBM DataStage
has a stage called BDFS that writes your data into Hadoop.
Thanks,
Kishore
On Tue, Sep 8, 2015 at 1:29 AM, <23...@gmail.com> wrote:
> Hi guys,
>
> I am looking for pointers on migrating existing data warehouse to Hadoop.
> Currently, we are using IBM Data stage an ETL tool and loading into
> Teradata staging/maintain tables. Please suggest an architecture which
> reduces cost without much degrade in performance. Has anyone of you been a
> part of such migration before? If yes then please provide some inputs,
> especially on what aspects should we be taking care of. Talking about
> source data, it is mainly in the form of flat files and database.
>
> Thanks in advance.
>
> Regards,
>
> Abhishek Singh
>