Posted to common-user@hadoop.apache.org by Jagadeesh <ja...@gmail.com> on 2006/12/18 07:00:11 UTC

Urgent: Production Issues

Hi All,

I am running Hadoop 0.7.2 in a production environment, and it currently
stores ~170GB of data. The deployment architecture I am using is described
below.

I am using 4 nodes with 1.3TB storage each and the master node is not being
used for storage. So I have 5 servers in total out of which 4 servers are
running Hadoop nodes. This setup was working fine for the last 20-25 days
and there were no issues. As mentioned earlier, the total storage has now
grown to ~170GB.

A couple of days ago, I noticed that Hadoop was no longer accepting new
files: uploads always failed, although downloads were still working fine.
The exception I got was "writing <filename>.crc failed". When I tried
restarting the service, I got the messages "jobtracker not available" and
"tasktracker not available". I then had to kill all the processes on the
master node as well as on the client nodes to restart the service.

After that, everything worked fine for one more day, and now I keep getting
the message:

failure closing block of file /user/root/.LICENSE.txt2233331.crc to node
node1:50010

Even if I restart the service, I get this message again after about 10
minutes.

I read on the mailing list that this issue is resolved in 0.9.0, but I am a
bit skeptical about moving to 0.9.0, as I don't know whether I will end up
losing the files that are already stored. Kindly confirm whether the
existing files will be preserved, and I will move to 0.9.0. Please also tell
me the steps or precautions I should take before moving to 0.9.0.

Thanks and Regards
Jugs