Posted to user@accumulo.apache.org by Aji Janis <aj...@gmail.com> on 2013/03/27 15:10:25 UTC

Waiting for accumulo to be initialized

Hello,

We have the following set up:

zookeeper - 3.3.3-1073969
hadoop - 0.20.203.0
accumulo - 1.4.2

Our zookeeper crashed for some reason. I tried doing a clean stop of
everything and then brought up (in order) zookeeper and hadoop (cluster).
But when trying to do a start-all on accumulo, the following message gets
printed to the screen endlessly:

“26 12:45:43,551 [server.Accumulo] INFO : Waiting for accumulo to be
initialized”



Doing some digging on the web, it seems that accumulo is hosed and needs
some re-initialization. It also appears that maybe I need to clean out
things from zookeeper and hadoop prior to a re-initialization. Has anyone
done this before? Can someone please provide me some directions on what to
do (or not to do)? Really appreciate help on this. Thanks.

Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Actually, this guide explains that running ./hadoop namenode -format would
cause this issue
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#javaioioexception-incompatible-namespaceids
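
For what it's worth, the workaround that guide describes (I have not tried
it myself) is to make each datanode's namespaceID match the namenode's
instead of reformatting, roughly:

$ bin/stop-all.sh
$ vi /opt/hadoop-data/hadoop/hdfs/data/current/VERSION
    (change namespaceID = 1868050007 to the namenode's value, 2089335599)
$ bin/start-all.sh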



Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Krishmin,

Thank you for the response. It's always great to hear from someone who has
tried out the steps (even if you had a different issue). Like you said, I am
not really sure what caused the crash in our env in the first place, but
having a plan is always good...

Thanks again all,
Aji



Re: Waiting for accumulo to be initialized

Posted by Krishmin Rai <kr...@missionfoc.us>.
Hi Aji,
I wrote the original question linked below (about re-initing Accumulo over an existing installation).  For what it's worth, I believe that my ZooKeeper data loss was related to the linux+java leap second bug -- not likely to be affecting you now (I did not go back and attempt to re-create the issue, so it's also possible there were other compounding issues). We have not encountered any ZK data-loss problems since. 

At the time, I did some basic experiments to understand the process better, and successfully followed (essentially) the steps Eric has described. The only real difficulty I had was identifying which directories corresponded to which tables; I ended up iterating over individual RFiles and manually identifying tables based on expected data. This was a somewhat painful process, but at least made me confident that it would be possible in production.

It's also important to note that, at least according to my understanding, this procedure still potentially loses data: mutations written after the last minor compaction will only have reached the write-ahead-logs and will not be available in the raw RFiles you're importing from.

-Krishmin



Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Eric, really appreciate you jotting this down. Too late to try it out this
time, but I will give this a try if (hopefully not) there is a next time.

Thanks again.




Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
I should write this up in the user manual.  It's not that hard, but it's
really not the first thing you want to tackle while learning how to use
accumulo.  I just opened
ACCUMULO-1217<https://issues.apache.org/jira/browse/ACCUMULO-1217> to
do that.

I wrote this from memory: expect errors.  Needless to say, you would only
want to do this when you are more comfortable with hadoop, zookeeper and
accumulo.

First, get zookeeper up and running, even if you have to delete all its data.

Next, attempt to determine the mapping of table names to tableIds.  You can
do this in the shell when your accumulo instance is healthy.  If it isn't
healthy, you will have to guess based on the data in the files in HDFS.

So, for example, the table "trace" is probably table id "1".  You can find
the files for trace in /accumulo/tables/1.
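
If it isn't obvious which directory belongs to which table, you can list the
table directories and dump file metadata with PrintInfo (again, from memory;
double-check the class name against your version):

$ hadoop fs -ls /accumulo/tables
$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/1/default_tablet/F0000000.rf

(the .rf file name is a made-up example; point it at one of your actual files)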

Don't worry if you get the names wrong.  You can always rename the tables
later.

Move the old files for accumulo out of the way and re-initialize:

$ hadoop fs -mv /accumulo /accumulo-old
$ ./bin/accumulo init
$ ./bin/start-all.sh

Recreate your tables:

$ ./bin/accumulo shell -u root -p mysecret
shell > createtable table1

Learn the new table id mapping:
shell > tables -l
!METADATA => !0
trace => 1
table1 => 2
...

Bulk import all your data back into the new table ids:
Assuming you have determined that "table1" used to be table id "a" and is
now "2",
you do something like this:

$ hadoop fs -mkdir /tmp/failed
$ ./bin/accumulo shell -u root -p mysecret
shell > table table1
shell table1 > importdirectory /accumulo-old/tables/a/default_tablet /tmp/failed true

There are lots of directories under every table id directory.  You will
need to import each of them.  I suggest creating a script and passing it to
the shell on the command line.
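
For example, something like this (an untested sketch; adjust the old-id-to-new-table
mapping to match yours) generates one importdirectory command per directory and
feeds them all to the shell:

#!/bin/bash
# Sketch: bulk import every tablet directory of old table id "a" into "table1".
# The paths and names are examples; substitute your own.
hadoop fs -ls /accumulo-old/tables/a | awk '{print $NF}' | grep '^/' |
while read dir; do
  echo "table table1"
  echo "importdirectory $dir /tmp/failed true"
done > import-commands.txt
./bin/accumulo shell -u root -p mysecret < import-commands.txt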

I know of instances in which trillions of entries were recovered and
available in a matter of hours.

-Eric




Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
When you say "you can move the files aside in HDFS"... which files are
you referring to? I have never set up zookeeper myself, so I am not aware of
all the changes needed.




Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
If you lose zookeeper, you can move the files aside in HDFS, recreate your
instance in zookeeper and bulk import all of the old files.  It's not
perfect: you lose table configurations, split points and user permissions,
but you do preserve most of the data.

You can back up each of these bits of information periodically if you like.
Outside of the files in HDFS, the configuration information is pretty
small.
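
For example, something like this (from memory, so verify the options) dumps a
table's configuration and split points to local files:

$ ./bin/accumulo shell -u root -p mysecret -e "config -t table1" > table1.config
$ echo -e "table table1\ngetsplits" | ./bin/accumulo shell -u root -p mysecret > table1.splits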

-Eric




Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Well, it was test data, so saving it wasn't high priority but a 'nice to
have'. Prior to asking the question here, I checked out this blog:
http://apache-accumulo.1065345.n5.nabble.com/Re-init-Accumulo-over-existing-installation-td345.html
So I knew data would be lost.

The reason I ask about saving data is because we are not quite sure why
zookeeper got hosed in the first place, and if this issue happened in Prod
I'd like to have some suggestions handy for saving data.

Responding inline...
Zookeeper crashing (or even `kill -9`ing) should have no effect on Hadoop.
Did Hadoop come up correctly before you tried to restart Accumulo?
-- Yes.

Did you then do the `hadoop namenode -format` and expect to keep your data?
If so, lesson learned?
-- Prior to trying the hadoop reformat (since I knew it would be destructive)
I tried stopping zookeeper cleanly, hoping that might clean up something -
clearly it didn't. I am fairly new to this, so definitely lesson learned.




Re: Waiting for accumulo to be initialized

Posted by Josh Elser <jo...@gmail.com>.
First off, I'm sorry about you losing data. I thought you recognized, from
reading that link you sent out, that this would be destructive to your data.
I wasn't really advising you from a "saving data" standpoint.

Zookeeper crashing (or even `kill -9`ing) should have no effect on 
Hadoop. Did Hadoop come up correctly before you tried to restart 
Accumulo? Did you then do the `hadoop namenode -format` and expect to 
keep your data? If so, lesson learned?




Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Eric and Josh, thanks for all your feedback. We ended up *losing all our
accumulo data* because I had to reformat hadoop. Here, in a nutshell, is what
I did:


   1. Stop accumulo
   2. Stop hadoop
   3. On hadoop master and all datanodes, from dfs.data.dir (hdfs-site.xml)
   remove everything under the data folder
   4. On hadoop master, from dfs.name.dir (hdfs-site.xml) remove everything
   under the name folder
   5. As hadoop user, execute .../hadoop/bin/hadoop namenode -format
   6. As hadoop user, execute .../hadoop/bin/start-all.sh ==> should
   populate the data/ and name/ dirs that were erased in steps 3 and 4.
   7. Initialize Accumulo - as accumulo user, execute
   ../accumulo/bin/accumulo init (I created a new instance)
   8. Start accumulo

I was wondering if anyone had suggestions or thoughts on how I could have
solved the original issue of accumulo waiting for initialization without
losing my accumulo data? Is it possible to do so?

Re: Waiting for accumulo to be initialized

Posted by Josh Elser <jo...@gmail.com>.
Just remove the directories configured for dfs.name.dir and dfs.data.dir 
and run the `hadoop namenode -format` again.
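
If you don't remember where those point, they are right in your hdfs-site.xml,
e.g.:

$ grep -A 1 'dfs.name.dir\|dfs.data.dir' $HADOOP_HOME/conf/hdfs-site.xml

(this assumes the <value> line follows each <name> line, which is the usual layout)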



Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Well... I found this in the datanode log:

 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
java.io.IOException: Incompatible namespaceIDs in
/opt/hadoop-data/hadoop/hdfs/data: namenode namespaceID = 2089335599;
datanode namespaceID = 1868050007





Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
"0 live nodes"  that will continue to be a problem.

Check the datanode logs.

-Eric



Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
I removed everything under /opt/hadoop-data/hadoop/hdfs/data/current/
because it seemed like old files were hanging around and I had to remove
them before I could start re-initialization.


I didn't move anything to /tmp or try a reboot.
My old accumulo instance had everything under /accumulo (in hdfs) and it's
still there, but I'm guessing that deleting stuff from hadoop-data deleted
a bunch of its files.

I tried restarting zookeeper and hadoop and they came up fine, but now my
namenode url says there are 0 live nodes (instead of 5 in my cluster). Doing a
ps -ef | grep hadoop on each node in the cluster, however, shows that hadoop
is running... so I am not sure what I messed up. Suggestions?

Have I lost accumulo for good? Should I just recreate the instance?



Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
Your DataNode has not started and reported blocks to the NameNode.

Did you store things (zookeeper, hadoop) in /tmp and reboot?  It's a common
thing to do, and it commonly deletes everything in /tmp.  If that's the
case, you will need to shut down hdfs and run:

$ hadoop namenode -format

And then start hdfs again.
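
You can watch safe mode with dfsadmin, and force it off if you are sure the
datanodes are healthy (it won't bring back blocks that are actually gone):

$ hadoop dfsadmin -safemode get
$ hadoop dfsadmin -safemode leave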

-Eric



Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
I see, thank you. When I bring up hdfs (start-all from the node with the
jobtracker) I see the following message on url: http://mynode:50070/dfshealth.jsp

Safe mode is ON. The ratio of reported blocks 0.0000 has not reached the
threshold 0.9990. Safe mode will be turned off automatically.
2352 files and directories, 2179 blocks = 4531 total. Heap Size is 54 MB
/ 888.94 MB (6%)

What's going on here?




Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
This will (eventually) delete everything created by accumulo in hdfs:

$ hadoop fs -rmr /accumulo

Accumulo will create a new area to hold your configurations.  Accumulo will
basically abandon that old configuration.  There's a class that can be used
to clean up old accumulo instances in zookeeper:

$ ./bin/accumulo org.apache.accumulo.server.util.CleanZookeeper hostname:port

Where "hostname:port" is the name of one of your zookeeper hosts.
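
If you want to see what is registered before and after, you can poke around
with zookeeper's own client; accumulo keeps its data under /accumulo:

$ ./bin/zkCli.sh -server hostname:port
ls /accumulo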

-Eric




Re: Waiting for accumulo to be initialized

Posted by Aji Janis <aj...@gmail.com>.
Thanks Eric. But shouldn't I be cleaning up something in the hadoop-data
directory too? and Zookeeper?




Re: Waiting for accumulo to be initialized

Posted by Eric Newton <er...@gmail.com>.
To re-initialize accumulo, bring up zookeeper and hdfs.

$ hadoop fs -rmr /accumulo
$ ./bin/accumulo init
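
init is interactive; it will prompt you for an instance name and an initial
root password, something like this (the values are just examples):

Instance name : myinstance
Enter initial password for root: ******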

I do this about 100 times a day on my dev box. :-)

-Eric

