Posted to hdfs-user@hadoop.apache.org by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/22 11:14:14 UTC

To Generate Test Data in HDFS (PDGF)

Hi,

I need to generate a large amount of test data (4 TB) in Hadoop. Has anyone used PDGF to do so? Could you share your cookbook for PDGF on Hadoop (or HBase)?

Many Thanks
Arthur

Can HDFS be used as a network drive for Windows clients, with Active Directory for access control?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

Has anyone used Hadoop as a network drive for Windows clients, with Microsoft Active Directory for access control? Could you please share your cookbook?

Regards
Arthur





Re: To Generate Test Data in HDFS (PDGF)

Posted by Jay Vyas <ja...@gmail.com>.
While on the subject: you can also use the BigPetStore application in Apache Bigtop to do this. Its data is well suited for HBase (semi-structured, transactional, and featuring some global patterns that can make for meaningful queries and so on).

git clone https://github.com/apache/bigtop.git
cd bigtop/bigtop-bigpetstore
gradle clean package   # build the jar

Then follow the instructions in the README to generate as many records as you want in a distributed context. Each record is around 80 bytes, so roughly 5 x 10^10 records should put you at the 4 TB scale you are looking for.
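The sizing estimate above is easy to sanity-check; a back-of-the-envelope sketch (using the approximate 80-byte record size from the reply, and decimal terabytes):

```python
# Back-of-the-envelope: how many ~80-byte records make up ~4 TB?
target_bytes = 4 * 10**12   # 4 TB (decimal)
record_bytes = 80           # approximate BigPetStore record size
records = target_bytes // record_bytes
print(f"{records:e}")       # prints 5.000000e+10, i.e. about 5 x 10^10 records
```

The actual on-disk footprint will vary with record content, file format, and HDFS replication factor, so treat this only as an order-of-magnitude target.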

> On Sep 22, 2014, at 5:14 AM, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
> 
> Hi,
> 
> I need to generate large amount of test data (4TB) into Hadoop, has anyone used PDGF to do so? Could you share your cook book about PDGF in Hadoop (or HBase)? 
> 
> Many Thanks
> Arthur
