Posted to common-user@hadoop.apache.org by Dhaya007 <mg...@gmail.com> on 2007/12/21 05:50:12 UTC

Examples for storing data in to Multi node cluster

I am doing R&D on Hadoop.
My requirement is to store huge amounts of data and retrieve it by search;
from what I have read on the web, Hadoop is a good solution for this.
If anyone has a document of this kind (some examples of storing data in a
multi-node environment), please share it and help me with this.


-- 
View this message in context: http://www.nabble.com/Examples-for-storing-data-in-to-Multi-node-cluster-tp14450333p14450333.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Examples for storing data in to Multi node cluster

Posted by Dhaya007 <mg...@gmail.com>.
Thanks for your reply.
I am new to Hadoop, so if anything is wrong please correct me.
I have already started looking at those websites, and I configured a
single/multi-node cluster using the link
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29.
I have followed the link, but I am not able to start Hadoop in the multi-node
environment.
The problems I am facing are as follows:
1. I have configured the master and slave nodes with passphrase-less SSH, but
if I try to run start-dfs.sh it prompts for the password of the master and
slave machines. (I have copied the master's .ssh/id_rsa.pub key into the
slaves' authorized_keys file.)
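For reference, a passphrase-less setup that avoids the password prompt usually looks like the following. This is a sketch, assuming the Hadoop user is hdusr on every node and that the hostnames master/slave resolve:

```shell
# On the master, as the hadoop user, generate a key with an empty
# passphrase (skip this if ~/.ssh/id_rsa already exists):
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

# Append the public key to authorized_keys on the master itself
# (start-dfs.sh also opens an ssh session to the master):
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Copy the public key to the slave; note the file name is
# authorized_keys, not autherized_key:
ssh-copy-id -i ~/.ssh/id_rsa.pub hdusr@slave

# ~/.ssh and authorized_keys must not be group/world writable, or sshd
# silently ignores the key and falls back to password authentication:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh hdusr@slave 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'

# Verify: both of these should log in without a password prompt.
ssh master true
ssh slave true
```

The permission check is worth doing on every node: wrong modes on ~/.ssh or authorized_keys would also explain the intermittent "permission denied" in point 3.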

2. After I enter the password, the namenode, datanode, jobtracker and
tasktracker start successfully on the master, but the datanode does not come
up on the slave.


3. Sometimes step 2 works, and sometimes it fails with "permission denied".

4. I have checked the datanode log on the slave; it reports an incompatible
namespaceID. I then formatted the slave and the master and started DFS with
start-dfs.sh, but I still get the error.
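The usual cause of the incompatible-namespaceID error is that the namenode was re-formatted while the slave's datanode kept its old storage directory; formatting again does not help until that directory is cleared. A sketch of the recovery, assuming hadoop.tmp.dir resolves to /home/hdusr/hadoop-hdusr (per the hadoop-site.xml in this thread) and that any data in HDFS is disposable:

```shell
# Stop all daemons from the master first:
bin/stop-all.sh

# On the slave, remove the datanode's storage so it adopts the new
# namespaceID on the next start (this deletes the HDFS blocks held there):
ssh slave rm -rf /home/hdusr/hadoop-hdusr/dfs/data

# Re-format the namenode on the master only -- datanodes are never formatted:
bin/hadoop namenode -format

# Start DFS again; the slave datanode should now register cleanly:
bin/start-dfs.sh
```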


The host entries in /etc/hosts on both master and slave:
master
slave

conf/masters:
master

conf/slaves:
master
slave
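One thing worth checking here: the hostnames in /etc/hosts must map to the machines' real network addresses, not to loopback. A sketch with placeholder IPs (substitute your LAN addresses):

```
# /etc/hosts on every node -- 192.168.0.x are placeholders:
192.168.0.1    master
192.168.0.2    slave
```

On Ubuntu in particular, a default line like "127.0.1.1 master" must not shadow the real entry, or the namenode binds to loopback and the slave's datanode can never reach it.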

The hadoop-site.xml for both master and slave:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hdusr/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
 
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
 
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
 
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is
created.
  The default is used if replication is not specified at create time.
  </description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>20</value>
  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
number of tasktrackers).
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>4</value>
  <description>As a rule of thumb, use 2x the number of slave processors
(i.e., number of tasktrackers).
  </description>
</property>
</configuration>
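With this configuration in place, it is worth confirming on each node which Java daemons are actually running, rather than trusting the console output of the start scripts. A sketch, run from the Hadoop install directory on the master:

```shell
# On the master -- expect NameNode, SecondaryNameNode and DataNode
# (plus JobTracker and TaskTracker after start-mapred.sh):
jps

# On the slave -- expect DataNode (and TaskTracker after start-mapred.sh):
ssh slave jps

# Ask the namenode how many datanodes have registered; with both nodes
# healthy this should report two live datanodes:
bin/hadoop dfsadmin -report
```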

Please help me to resolve this, or else provide another tutorial for
multi-node cluster setup. I am eagerly waiting for the tutorials.


Thanks




Khalil Honsali wrote:
> 
> I think you are referring to storage on parallel and/or distributed file
> systems; Hadoop is built on top of a Google-FS-like file system called
> HDFS.
> All Hadoop-related info is at: http://lucene.apache.org/hadoop/
> Please start with this first:
> http://wiki.apache.org/lucene-hadoop/ImportantConcepts
> 
> 
> On 21/12/2007, Dhaya007 <mg...@gmail.com> wrote:
>>
>>
>> I am doing R&D on Hadoop.
>> My requirement is to store huge amounts of data and retrieve it by search;
>> from what I have read on the web, Hadoop is a good solution for this.
>> If anyone has a document of this kind (some examples of storing data in a
>> multi-node environment), please share it and help me with this.
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Examples-for-storing-data-in-to-Multi-node-cluster-tp14450333p14450333.html
>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Examples-for-storing-data-in-to-Multi-node-cluster-tp14450333p14453550.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Examples for storing data in to Multi node cluster

Posted by Khalil Honsali <k....@gmail.com>.
I think you are referring to storage on parallel and/or distributed file
systems; Hadoop is built on top of a Google-FS-like file system called HDFS.
All Hadoop-related info is at: http://lucene.apache.org/hadoop/
Please start with this first:
http://wiki.apache.org/lucene-hadoop/ImportantConcepts
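On the original question of storing data on a multi-node cluster: once HDFS is running, files are written and read with the Hadoop DFS shell, and their blocks are distributed across the datanodes automatically. A minimal sketch, assuming a running cluster, user hdusr, and a local file named input.txt (hypothetical names):

```shell
# Copy a local file into HDFS; its blocks are replicated across the
# cluster's datanodes according to dfs.replication:
bin/hadoop dfs -put input.txt /user/hdusr/input.txt

# List and read it back -- this works from any node in the cluster:
bin/hadoop dfs -ls /user/hdusr
bin/hadoop dfs -cat /user/hdusr/input.txt

# Retrieve a copy to the local filesystem:
bin/hadoop dfs -get /user/hdusr/input.txt retrieved.txt
```

Searching over the stored data is then typically done with a MapReduce job (for example, the grep example shipped with Hadoop) rather than through the DFS shell itself.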


On 21/12/2007, Dhaya007 <mg...@gmail.com> wrote:
>
>
> I am doing R&D on Hadoop.
> My requirement is to store huge amounts of data and retrieve it by search;
> from what I have read on the web, Hadoop is a good solution for this.
> If anyone has a document of this kind (some examples of storing data in a
> multi-node environment), please share it and help me with this.
>
>
> --
> View this message in context:
> http://www.nabble.com/Examples-for-storing-data-in-to-Multi-node-cluster-tp14450333p14450333.html
> Sent from the Hadoop Users mailing list archive at Nabble.com.
>
>