Posted to user@hadoop.apache.org by Jitendra Yadav <je...@gmail.com> on 2013/10/08 17:47:01 UTC

Migrating from Legacy to Hadoop.

Hi All,

We are planning to consolidate our three existing warehouse databases
into a Hadoop cluster. In our testing phase we have designed the target
environment and transferred the data from source to target (not yet in
sync, but almost complete). These legacy systems use traditional
ETL/replication mechanisms such as GoldenGate, loaders, and PL/SQL.
FYI, our current environment is 80% PL/SQL code and SQL Server
packages. We have been rewriting some of the ETL jobs in Java and
Python MapReduce, but we are looking for more and easier
alternatives.
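
For readers unfamiliar with what a Python MapReduce rewrite of an ETL step looks like, here is a minimal sketch. It is not the poster's code: the `region,amount` record layout and the function names (`map_line`, `reduce_key`, `run_local`) are assumptions made for illustration. On a cluster the mapper and reducer would typically read stdin and write tab-separated lines under Hadoop Streaming; here the sort/shuffle step is simulated locally so the logic can be run anywhere.

```python
import itertools
from typing import Iterable, Iterator, List, Tuple


def map_line(line: str) -> Iterator[Tuple[str, float]]:
    """Mapper: parse one 'region,amount' record (hypothetical layout)."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        return  # skip malformed records, like a loader's reject step
    region, raw_amount = parts
    try:
        amount = float(raw_amount)
    except ValueError:
        return  # skip records whose amount is not numeric
    yield region, amount


def reduce_key(region: str, amounts: Iterable[float]) -> Tuple[str, float]:
    """Reducer: total all amounts that share one key."""
    return region, sum(amounts)


def run_local(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Simulate map -> sort/shuffle -> reduce on a single machine."""
    mapped = [pair for line in lines for pair in map_line(line)]
    mapped.sort(key=lambda kv: kv[0])  # the framework sorts by key
    return [
        reduce_key(key, (value for _, value in group))
        for key, group in itertools.groupby(mapped, key=lambda kv: kv[0])
    ]


if __name__ == "__main__":
    sample = ["east,10.5", "west,2", "east,4.5", "bad line", "north,abc"]
    for region, total in run_local(sample):
        print(region, total)
```

Even a simple aggregation needs explicit parsing, error handling and grouping at this level, which is part of why higher-level tools are attractive for large PL/SQL code bases.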

What is the best approach to complete this process? When making
suggestions, please take effort and timing into consideration (if
possible).

Please guide.

Regards
Jitendra

Re: Migrating from Legacy to Hadoop.

Posted by Frank <fh...@comcast.net>.
A low-cost ETL alternative is Syncsort DMX-h ETL, which extends Hadoop MapReduce.

On Oct 8, 2013, at 10:16 PM, Peyman Mohajerian <mo...@gmail.com> wrote:

> I wonder if a JDBC driver over Hive could help you, if your legacy ETL jobs can talk to a JDBC driver. It is a slow way of writing to HDFS, and I don't have any experience doing it. For example:
> http://doc.cloveretl.com/documentation/UserGuide/index.jsp?topic=/com.cloveretl.gui.docs/docs/hive-connection.html

Re: Migrating from Legacy to Hadoop.

Posted by Peyman Mohajerian <mo...@gmail.com>.
I wonder if a JDBC driver over Hive could help you, if your legacy ETL
jobs can talk to a JDBC driver. It is a slow way of writing to HDFS, and
I don't have any experience doing it. For example:
http://doc.cloveretl.com/documentation/UserGuide/index.jsp?topic=/com.cloveretl.gui.docs/docs/hive-connection.html


On Tue, Oct 8, 2013 at 10:07 AM, Jitendra Yadav
<je...@gmail.com> wrote:

> Hi Bertrand,
>
> Thanks for your reply.
>
> As per my understanding, the open source tools mentioned do not
> support procedural language (PL) flexibility. Right?
>
> I was looking for other alternatives so that we can migrate our
> existing code rather than creating Java UDFs etc. So is handling complex
> ETL business logic still very difficult on Hadoop in terms of
> coding, QA and performance?
>
>
> Regards
> Jitendra

Re: Migrating from Legacy to Hadoop.

Posted by Jitendra Yadav <je...@gmail.com>.
Hi Bertrand,

Thanks for your reply.

As per my understanding, the open source tools mentioned do not
support procedural language (PL) flexibility. Right?

I was looking for other alternatives so that we can migrate our
existing code rather than creating Java UDFs etc. So is handling complex
ETL business logic still very difficult on Hadoop in terms of
coding, QA and performance?


Regards
Jitendra

On 10/8/13, Bertrand Dechoux <de...@gmail.com> wrote:
> Open source: Pig, Hive, Cascading ...
> Other: Talend ...
>
> Is that the answer you are expecting or are you looking for something more
> specific?
>
> Regards
>
> Bertrand

Re: Migrating from Legacy to Hadoop.

Posted by Bertrand Dechoux <de...@gmail.com>.
Open source: Pig, Hive, Cascading ...
Other: Talend ...

Is that the answer you are expecting or are you looking for something more
specific?
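
To make concrete what moving from procedural code to a declarative tool such as Hive involves, here is a small sketch contrasting a row-by-row loop (the style of a PL/SQL cursor loop) with a single set-based query. Everything in it is an assumption for illustration: the `sales` table, its columns, and the sample rows are invented, and Python's built-in `sqlite3` is used only as a local stand-in so the query can run without a cluster. A query of this shape (`SELECT ... SUM(...) ... GROUP BY ...`) is also the kind of statement Hive compiles into MapReduce jobs.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical sample data: (region, amount)
ROWS: List[Tuple[str, float]] = [("east", 10.0), ("west", 2.0), ("east", 5.0)]


def totals_procedural(rows: List[Tuple[str, float]]) -> Dict[str, float]:
    """Row-by-row accumulation, as a cursor loop in a stored procedure would do."""
    totals: Dict[str, float] = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_set_based(rows: List[Tuple[str, float]]) -> Dict[str, float]:
    """The same result expressed as one declarative GROUP BY query."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Hive
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_procedural(ROWS))
    print(totals_set_based(ROWS))
```

Logic that can be restated in this set-based form tends to port to Hive or Pig with little custom code. Logic that depends on per-row control flow is where UDFs or hand-written MapReduce usually come in.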

Regards

Bertrand


On Tue, Oct 8, 2013 at 5:47 PM, Jitendra Yadav
<je...@gmail.com> wrote:

> Hi All,
>
> We are planning to consolidate our three existing warehouse databases
> into a Hadoop cluster. In our testing phase we have designed the target
> environment and transferred the data from source to target (not yet in
> sync, but almost complete). These legacy systems use traditional
> ETL/replication mechanisms such as GoldenGate, loaders, and PL/SQL.
> FYI, our current environment is 80% PL/SQL code and SQL Server
> packages. We have been rewriting some of the ETL jobs in Java and
> Python MapReduce, but we are looking for more and easier
> alternatives.
>
> What is the best approach to complete this process? When making
> suggestions, please take effort and timing into consideration (if
> possible).
>
> Please guide.
>
> Regards
> Jitendra
>
