Posted to mapreduce-user@hadoop.apache.org by Sandeep Nemuri <nh...@gmail.com> on 2013/07/23 08:24:23 UTC

Copy data from Mainframe to HDFS

Hi,

"How to copy datasets from Mainframe to HDFS directly?  I know that we can
NDM files to Linux box and then we can use simple put command to copy data
to HDFS.  But, how to copy data directly from mainframe to HDFS?  I have
PS, PDS and VSAM datasets to copy to HDFS for analysis using MapReduce.

Also, Do we need to convert data from EBCDIC to ASCII before copy? "
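
A minimal sketch of the NDM-then-put route, assuming the dataset has already landed on a Linux edge node as /data/landing/customer.dat (a hypothetical path) and contains only character data:

    # Load the landed file into HDFS (all paths here are examples only).
    hadoop fs -mkdir /user/sandeep/mainframe
    hadoop fs -put /data/landing/customer.dat /user/sandeep/mainframe/

    # If the file is still EBCDIC and holds only character fields, dd can convert it;
    # packed-decimal (COMP-3) or binary fields need a copybook-aware converter instead.
    dd if=/data/landing/customer.dat of=/data/landing/customer.ascii conv=ascii
    hadoop fs -put /data/landing/customer.ascii /user/sandeep/mainframe/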

-- 
--Regards
  Sandeep Nemuri

Re: Copy data from Mainframe to HDFS

Posted by Jun Ping Du <jd...@vmware.com>.
Hi Sandeep, 
I think Apache Oozie is what you are looking for. It provides workflow management for Hadoop (and Pig, Hive, etc.) jobs and supports running jobs repeatedly on a schedule. Please refer to http://oozie.apache.org/docs/3.3.2/ for details. 
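
As a rough sketch (host, paths, and property values here are placeholders), an hourly coordinator is packaged as a coordinator.xml uploaded to HDFS and driven from the Oozie command line:

    # job.properties points at the coordinator application in HDFS, e.g.:
    #   oozie.coord.application.path=hdfs://namenode:8020/user/sandeep/coord-app
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run

    # Check the status of the submitted coordinator.
    oozie job -oozie http://oozie-host:11000/oozie -info <coordinator-job-id>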

Thanks, 

Junping 

Re: Copy data from Mainframe to HDFS

Posted by Sandeep Nemuri <nh...@gmail.com>.
Thanks for your reply, guys.
I am looking for an open-source option; do we have any?


-- 
--Regards
  Sandeep Nemuri

RE: Copy data from Mainframe to HDFS

Posted by Devaraj k <de...@huawei.com>.
Hi Balamurali,

As far as I know, there is nothing in Hadoop that does exactly what you need.

You can write MapReduce jobs for your functionality, submit them hourly/daily/weekly/monthly, and then aggregate the results.

If you want help with Hive, you can ask on the Hive mailing list.
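
For example, a plain cron entry is often enough to kick off a recurring aggregation; a rough sketch, with hypothetical table, column, paths and schedule:

    # crontab entry: compute an hourly average and append it to a log (all names are placeholders).
    0 * * * * /usr/bin/hive -e "SELECT AVG(value) FROM readings" >> /var/log/hourly_avg.log 2>&1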



Thanks
Devaraj k


Re: Copy data from Mainframe to HDFS

Posted by Balamurali <ba...@gmail.com>.
Hi,

I have configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0.
I created a table in HBase, inserted records, and am processing the data using Hive.
I have to show a graph with some points (7 points for 7 days, or 12 for one year).
One day may contain anywhere from 1,000 records up to lakhs of records, and I need
to show the average of those records. Is there any built-in Hadoop mechanism to
process these records fast?

Also, I need to run a Hive query or job (when we run a Hive query, a job is
actually submitted) every hour. Is there a scheduling mechanism in Hadoop to
handle this?


Please reply.
Balamurali



Re: Copy data from Mainframe to HDFS

Posted by Raj K Singh <ra...@gmail.com>.
On a mainframe you typically have 3 types of data sources:
--flat files
--VSAM files
--DB2/IMS

DB2 and IMS provide export utilities to copy the data into flat files, which
you can then fetch over ftp/sftp.
VSAM files can be exported to flat files using the IDCAMS utility.
Flat files can be fetched directly over ftp/sftp.
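
Once the export exists as a flat file, it can be pulled from the mainframe's FTP server and streamed into HDFS without an intermediate landing copy. A rough sketch with placeholder host, credentials and paths, assuming the hadoop fs -put command accepts "-" for stdin and that the FTP server exposes the file at a plain path (the exact path syntax for a z/OS dataset depends on the server configuration):

    # Stream an exported flat file from the mainframe FTP server into HDFS.
    curl --user batchuser:secret "ftp://zos-host/export/customer.dat" \
      | hadoop fs -put - /user/sandeep/mainframe/customer.dat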

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370



Re: Copy data from Mainframe to HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Sandeep,

You don't have to convert the data in order to copy it into HDFS, but you
might have to think about how MapReduce will process these files because of
their format.

You could probably make use of Sqoop <http://sqoop.apache.org/>.
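
Sqoop helps when the mainframe data is reachable over JDBC (for example DB2), rather than for PS/PDS/VSAM datasets directly; a minimal sketch, with placeholder connection details and assuming the DB2 JDBC driver is on Sqoop's classpath:

    # Import a DB2 table from the mainframe into HDFS over JDBC (all values are placeholders).
    sqoop import \
      --connect jdbc:db2://zos-host:446/PRODDB \
      --username batchuser --password secret \
      --table CUSTOMER \
      --target-dir /user/sandeep/mainframe/customer \
      --num-mappers 4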

I also came across DMX-H a few days ago while browsing. I don't know
anything about the licensing and how good it is. Just thought of sharing it
with you. You can visit their page (http://www.syncsort.com/en/Data-Integration/Home)
to see more. They also provide a VM (includes CDH) to get started quickly.

Warm Regards,
Tariq
cloudfront.blogspot.com

