Posted to common-user@hadoop.apache.org by Richard Zhang <ri...@gmail.com> on 2008/06/03 04:03:29 UTC

create less than 10G data/host with RandomWriter

Hello Hadoopers:
I am running RandomWriter on an 8-node cluster. The default setting creates
1 GB per mapper with 10 mappers per host, which, with a replication factor
of 3, comes to about 30 GB per host (10 maps x 1 GB x 3 replicas). Each node
in the cluster has at most 30 GB of disk, so the cluster fills up and cannot
execute any further commands. I created a new application configuration file
that lowers bytes_per_map to roughly 100 MB per mapper, but it still seems
to create 30 GB of data and still runs each node out of disk. Is this the
right way to generate less than 10 GB of data per host with RandomWriter?
Below are the command and application configuration file I used.

 bin/hadoop jar hadoop-0.17.0-examples.jar randomwriter rand -conf randConfig.xml
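
In case the ordering of the options matters: as far as I understand, the
documented form puts generic options such as -conf before the program
arguments, so another variant that may be worth trying is:

 bin/hadoop jar hadoop-0.17.0-examples.jar randomwriter -conf randConfig.xml rand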

Below is the application configuration file I am using:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
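<!-- RandomWriter runs maps_per_host map tasks on each host and each map task
     writes bytes_per_map bytes, so the raw output is roughly
     maps_per_host * hosts * bytes_per_map, multiplied again on disk by the
     HDFS replication factor. Note that 103741824 bytes below is about
     100 MB; a full 1 GB would be 1073741824. -->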
<property>
  <name>test.randomwriter.maps_per_host</name>
  <value>10</value>
  <description>Number of map tasks to run on each host.</description>
</property>

<property>
  <name>test.randomwrite.bytes_per_map</name>
  <value>103741824</value>
  <description>Number of bytes written by each map task.</description>
</property>

<property>
  <name>test.randomwrite.min_key</name>
  <value>10</value>
  <description>Minimum size of each random key, in bytes.</description>
</property>

<property>
  <name>test.randomwrite.max_key</name>
  <value>1000</value>
  <description>Maximum size of each random key, in bytes.</description>
</property>

<property>
  <name>test.randomwrite.min_value</name>
  <value>0</value>
  <description>Minimum size of each random value, in bytes.</description>
</property>

<property>
  <name>test.randomwrite.max_value</name>
  <value>20000</value>
  <description>Maximum size of each random value, in bytes.</description>
</property>
</configuration>
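
For reference, one way to see how much space the run has consumed and to
remove the previous output before retrying (rand being the same output
directory as in the command above):

 bin/hadoop dfsadmin -report
 bin/hadoop dfs -du rand
 bin/hadoop dfs -rmr rand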