Posted to hdfs-user@hadoop.apache.org by Siddharth Tiwari <si...@live.com> on 2012/08/28 18:24:11 UTC

Hadoop and MainFrame integration

Hi Users.

We have flat files on mainframes with around a billion records. We need to sort them and then use them with different jobs on the mainframe for report generation. I was wondering whether there was any way I could integrate the mainframe with Hadoop, do the sorting there, and keep the file on the server itself (I do not want to FTP the file to a Hadoop cluster and then FTP the sorted file back to the mainframe, as that would waste MIPS and nullify the advantage). This way I could save on MIPS and ultimately improve profitability.

Thank you in advance

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.” 

"Maybe other people will try to limit me but I don't limit myself"

Re: Hadoop and MainFrame integration

Posted by Mathias Herberts <ma...@gmail.com>.
Build a custom transfer mechanism in Java and run it on a zAAP so you won't
consume general-purpose MIPS.
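
A minimal sketch of what such a transfer tool could look like, assuming the
flat file is visible as an ordinary file under z/OS UNIX System Services;
the NameNode address and paths below are placeholders. The HDFS client API
does the streaming, and since the tool is plain Java the copy should be
zAAP-eligible:

import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class MainframeToHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; point this at your cluster.
        conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
        FileSystem hdfs = FileSystem.get(conf);

        // Placeholder USS path; a real tool would read the dataset through
        // whatever file interface your z/OS setup provides.
        InputStream in = new FileInputStream("/u/batch/flatfile.dat");
        FSDataOutputStream out = hdfs.create(new Path("/staging/flatfile.dat"));

        // Stream in 64 KB chunks and close both streams when done.
        IOUtils.copyBytes(in, out, 64 * 1024, true);
    }
}

Running the client on the mainframe side keeps the data path to a single
network hop, with no intermediate FTP landing zone.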
On Aug 28, 2012 6:24 PM, "Siddharth Tiwari" <si...@live.com>
wrote:

>  Hi Users.
>
> We have flat files on mainframes with around a billion records. We need to
> sort them and then use them with different jobs on the mainframe for report
> generation. I was wondering whether there was any way I could integrate the
> mainframe with Hadoop, do the sorting there, and keep the file on the server
> itself (I do not want to FTP the file to a Hadoop cluster and then FTP the
> sorted file back to the mainframe, as that would waste MIPS and nullify the
> advantage). This way I could save on MIPS and ultimately improve profitability.
>
> Thank you in advance
>
>
> **------------------------**
> *Cheers !!!*
> *Siddharth Tiwari*
> Have a refreshing day !!!
> *"Every duty is holy, and devotion to duty is the highest form of worship
> of God.” *
> *"Maybe other people will try to limit me but I don't limit myself"*
>

Re: Hadoop and MainFrame integration

Posted by Steve Loughran <st...@hortonworks.com>.
On 28 August 2012 09:24, Siddharth Tiwari <si...@live.com> wrote:

>  Hi Users.
>
> We have flat files on mainframes with around a billion records. We need to
> sort them and then use them with different jobs on the mainframe for report
> generation. I was wondering whether there was any way I could integrate the
> mainframe with Hadoop, do the sorting there, and keep the file on the server
> itself (I do not want to FTP the file to a Hadoop cluster and then FTP the
> sorted file back to the mainframe, as that would waste MIPS and nullify the
> advantage). This way I could save on MIPS and ultimately improve profitability.
>
>
Can you NFS-mount the mainframe filesystem from the Hadoop cluster?
Otherwise, do you or your mainframe vendor have a custom Hadoop filesystem
binding for the mainframe?

If not, you should be able to use ftp:// URLs as the source of data for the
initial MR job; at the end of the sequence of MR jobs, the result can go
back to the mainframe.
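
For example, a sort job might read the flat file straight off the
mainframe's FTP server via Hadoop's built-in FTPFileSystem. A hedged sketch,
with host, credentials and paths invented; it uses a single reducer for
simplicity, so a multi-reducer total sort would additionally need something
like TotalOrderPartitioner:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FtpSortJob {
    // Emit each record as the map output key; the shuffle then sorts the
    // records for us and the default (identity) reducer writes them out.
    public static class LineAsKeyMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            ctx.write(line, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "mainframe-sort");
        job.setJarByClass(FtpSortJob.class);
        job.setMapperClass(LineAsKeyMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setNumReduceTasks(1); // one reducer = one totally ordered file

        // Invented FTP endpoint; reads go through Hadoop's FTPFileSystem.
        FileInputFormat.addInputPath(job, new Path(
                "ftp://user:password@mainframe.example.com/datasets/flatfile.dat"));
        FileOutputFormat.setOutputPath(job, new Path("/sorted/flatfile"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that an ftp:// input is slow and unsplittable compared to HDFS, so for
repeated processing it is usually better to copy the data into the cluster
first.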

Re: Hadoop and MainFrame integration

Posted by Ankam Venkateshwarlu <an...@gmail.com>.
Can you please explain how to automate the process of sending files back
and forth from the mainframe?  Is it done using NDM (Connect:Direct)?

I have a requirement to migrate mainframe workloads to Hadoop.  I am looking
for more information in this area: the economics, the process, and the tools
that support such a migration.  Any information related to this is highly
appreciated.  Thanks!

Regards,
Venkat Ankam

On Tue, Aug 28, 2012 at 10:00 PM, modemide <mo...@gmail.com> wrote:

> At some point in the workflow you're going to have to transfer the file
> from the mainframe to the Hadoop cluster for processing, and then send it
> back for storage on the mainframe.
>
> You should be able to automate the process of sending the files back and
> forth.
>
> It's been my experience that it's often faster to process and sort large
> files on a Hadoop cluster even while factoring in the cost to transfer
> to/from the mainframe.
>
> Hopefully that answers your question.  If not, are you looking to actually
> use Hadoop to process files in place on the mainframe?  That concept
> conflicts with my understanding of Hadoop.
>
> On Tue, Aug 28, 2012 at 12:24 PM, Siddharth Tiwari <
> siddharth.tiwari@live.com> wrote:
>
>>  Hi Users.
>>
>> We have flat files on mainframes with around a billion records. We need
>> to sort them and then use them with different jobs on the mainframe for
>> report generation. I was wondering whether there was any way I could
>> integrate the mainframe with Hadoop, do the sorting there, and keep the
>> file on the server itself (I do not want to FTP the file to a Hadoop
>> cluster and then FTP the sorted file back to the mainframe, as that would
>> waste MIPS and nullify the advantage). This way I could save on MIPS and
>> ultimately improve profitability.
>>
>> Thank you in advance
>>
>>
>> **------------------------**
>> *Cheers !!!*
>> *Siddharth Tiwari*
>> Have a refreshing day !!!
>> *"Every duty is holy, and devotion to duty is the highest form of
>> worship of God.” *
>> *"Maybe other people will try to limit me but I don't limit myself"*
>>
>
>

Re: Hadoop and MainFrame integration

Posted by modemide <mo...@gmail.com>.
At some point in the workflow you're going to have to transfer the file
from the mainframe to the Hadoop cluster for processing, and then send it
back for storage on the mainframe.

You should be able to automate the process of sending the files back and
forth.
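
A minimal sketch of that automation, assuming the mainframe exposes its
datasets over FTP (host, credentials and paths below are invented); Hadoop's
FileUtil can copy between its FTP and HDFS filesystem implementations, so
the pull and the push are the same call:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class RoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem ftp = FileSystem.get(
                URI.create("ftp://user:password@mainframe.example.com/"), conf);
        FileSystem hdfs = FileSystem.get(
                URI.create("hdfs://namenode.example.com:8020/"), conf);

        // Pull the flat file onto the cluster...
        FileUtil.copy(ftp, new Path("/datasets/flatfile.dat"),
                hdfs, new Path("/staging/flatfile.dat"),
                false /* keep the source */, conf);

        // ...run the sort job here, then push the result back.
        FileUtil.copy(hdfs, new Path("/sorted/flatfile/part-r-00000"),
                ftp, new Path("/datasets/flatfile.sorted.dat"),
                false, conf);
    }
}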

It's been my experience that it's often faster to process and sort large
files on a Hadoop cluster even while factoring in the cost to transfer
to/from the mainframe.

Hopefully that answers your question.  If not, are you looking to actually
use Hadoop to process files in place on the mainframe?  That concept
conflicts with my understanding of Hadoop.

On Tue, Aug 28, 2012 at 12:24 PM, Siddharth Tiwari <
siddharth.tiwari@live.com> wrote:

>  Hi Users.
>
> We have flat files on mainframes with around a billion records. We need to
> sort them and then use them with different jobs on the mainframe for report
> generation. I was wondering whether there was any way I could integrate the
> mainframe with Hadoop, do the sorting there, and keep the file on the server
> itself (I do not want to FTP the file to a Hadoop cluster and then FTP the
> sorted file back to the mainframe, as that would waste MIPS and nullify the
> advantage). This way I could save on MIPS and ultimately improve profitability.
>
> Thank you in advance
>
>
> **------------------------**
> *Cheers !!!*
> *Siddharth Tiwari*
> Have a refreshing day !!!
> *"Every duty is holy, and devotion to duty is the highest form of worship
> of God.” *
> *"Maybe other people will try to limit me but I don't limit myself"*
>

Re: Hadoop and MainFrame integration

Posted by Chris Smith <cs...@gmail.com>.
Siddharth,

I don't know if you've already found this video, but here is Phillip Shelley
(Sears CTO) talking about Sears' experience with Hadoop and their mainframe:
http://youtu.be/8Rztad665po

They use FTP to move data to and from Hadoop, which is a bottleneck, but
they gained so many 'spare' MIPS and reduced the processing time so
significantly that the MIPS burnt in the FTP transfers probably fade into
insignificance.

BTW, according to the talk, their COBOL programmers found Pig to be a good
tool for re-engineering some of their existing COBOL jobs.

It may not be a solution to your problem but it should give you some
confidence in using Hadoop to support the mainframe.

Regards,

Chris

On 29 August 2012 03:14, Artem Ervits <ar...@nyp.org> wrote:

>  Can you read the data off backup tapes and dump it to flat files?
>
>
> Artem Ervits
> Data Analyst
> New York Presbyterian Hospital
>
>  *From*: Marcos Ortiz [mailto:mlortiz@uci.cu]
> *Sent*: Tuesday, August 28, 2012 06:51 PM
> *To*: user@hadoop.apache.org <us...@hadoop.apache.org>
> *Cc*: Siddharth Tiwari <si...@live.com>
> *Subject*: Re: Hadoop and MainFrame integration
>
>  The problem with that is that Hadoop computes on top of HDFS, which stores
> data in blocks of 64/128 MB (or whatever block size you configure; 64 MB is
> the de facto default) and then runs the computation where those blocks live.
> So you need to move all your data to an HDFS cluster if you want to do the
> computation with MapReduce jobs.
> Best wishes
>
> On 28/08/2012 12:24, Siddharth Tiwari wrote:
>
> Hi Users.
>
> We have flat files on mainframes with around a billion records. We need to
> sort them and then use them with different jobs on the mainframe for report
> generation. I was wondering whether there was any way I could integrate the
> mainframe with Hadoop, do the sorting there, and keep the file on the server
> itself (I do not want to FTP the file to a Hadoop cluster and then FTP the
> sorted file back to the mainframe, as that would waste MIPS and nullify the
> advantage). This way I could save on MIPS and ultimately improve profitability.
>
> Thank you in advance
>
>
> **------------------------**
> *Cheers !!!*
> *Siddharth Tiwari*
> Have a refreshing day !!!
> *"Every duty is holy, and devotion to duty is the highest form of worship
> of God.�€ *
> *"Maybe other people will try to limit me but I don't limit myself"*

Re: Hadoop and MainFrame integration

Posted by Artem Ervits <ar...@nyp.org>.
Can you read the data off backup tapes and dump it to flat files?


Artem Ervits
Data Analyst
New York Presbyterian Hospital

From: Marcos Ortiz [mailto:mlortiz@uci.cu]
Sent: Tuesday, August 28, 2012 06:51 PM
To: user@hadoop.apache.org <us...@hadoop.apache.org>
Cc: Siddharth Tiwari <si...@live.com>
Subject: Re: Hadoop and MainFrame integration

The problem with that is that Hadoop computes on top of HDFS, which stores data in blocks of 64/128 MB (or whatever block size you configure; 64 MB is the de facto default) and then runs the computation where those blocks live.
So you need to move all your data to an HDFS cluster if you want to do the computation with MapReduce jobs.
Best wishes

On 28/08/2012 12:24, Siddharth Tiwari wrote:
Hi Users.

We have flat files on mainframes with around a billion records. We need to sort them and then use them with different jobs on the mainframe for report generation. I was wondering whether there was any way I could integrate the mainframe with Hadoop, do the sorting there, and keep the file on the server itself (I do not want to FTP the file to a Hadoop cluster and then FTP the sorted file back to the mainframe, as that would waste MIPS and nullify the advantage). This way I could save on MIPS and ultimately improve profitability.

Thank you in advance


*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.�
"Maybe other people will try to limit me but I don't limit myself"

Re: Hadoop and MainFrame integration

Posted by Marcos Ortiz <ml...@uci.cu>.
The problem with that is that Hadoop computes on top of HDFS, which stores
data in blocks of 64/128 MB (or whatever block size you configure; 64 MB is
the de facto default) and then runs the computation where those blocks live.
So you need to move all your data to an HDFS cluster if you want to do the
computation with MapReduce jobs.
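
As a small illustration of that point, the block size is a per-file
setting, so a client can pick a larger block size when it writes the staged
file into HDFS; values and paths below are made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeExample {
    public static void main(String[] args) throws Exception {
        FileSystem hdfs = FileSystem.get(new Configuration());

        long blockSize = 128L * 1024 * 1024; // 128 MB instead of the 64 MB default
        short replication = 3;               // usual HDFS replication factor
        int bufferSize = 4096;

        // The create() overload takes the block size for this one file.
        FSDataOutputStream out = hdfs.create(new Path("/staging/flatfile.dat"),
                true /* overwrite */, bufferSize, replication, blockSize);
        out.writeBytes("record data goes here\n");
        out.close();
    }
}
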
Best wishes

On 28/08/2012 12:24, Siddharth Tiwari wrote:
> Hi Users.
>
> We have flat files on mainframes with around a billion records. We need
> to sort them and then use them with different jobs on the mainframe for
> report generation. I was wondering whether there was any way I could
> integrate the mainframe with Hadoop, do the sorting there, and keep the
> file on the server itself (I do not want to FTP the file to a Hadoop
> cluster and then FTP the sorted file back to the mainframe, as that would
> waste MIPS and nullify the advantage). This way I could save on MIPS
> and ultimately improve profitability.
>
> Thank you in advance
>
>
> **------------------------**
> *_Cheers !!!_*
> *Siddharth Tiwari*
> Have a refreshing day !!!
> *"Every duty is holy, and devotion to duty is the highest form of 
> worship of God.” *
> *"Maybe other people will try to limit me but I don't limit myself"*
