Posted to dev@accumulo.apache.org by Jeff N <ma...@gmail.com> on 2015/05/27 22:48:47 UTC

Weird table state

I have a table where all of its HDFS data has been wiped and all of the
information within the metadata table about said table has been wiped, but
the monitor and shell will still list the table. The table is showing as
online but with dashes for tablets/entries/etc. All of my interactions with
the table lead to the shell hanging on IO when talking to the Master. Is
there any way to wipe the table listing from another place?





--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Weird-table-state-tp14279.html
Sent from the Developers mailing list archive at Nabble.com.

Re: Weird table state

Posted by Josh Elser <jo...@gmail.com>.
Sweet, that was the one I had in mind. Glad you found it :)

If you're able and willing to move to a 1.6.x release, that is likely the 
best decision and jibes with our current release plan. Thanks for the 
feedback.

Jeff N wrote:
> The main ticket I found was ACCUMULO-3182, which basically told me that if
> I had been using 1.6.2 when my namenode went down, I wouldn't have had to
> manually scrub the problem data from all my nodes. We run our
> master/slaves/zookeeper through a virtualization layer and this has been
> the root of some of our problems. The goal is to move to at least the 1.6.2
> distro and do away with some of the virtualization so that there isn't ever
> an issue with communication between nodes. If there is any other
> information I can provide, I'm happy to help.

Re: Weird table state

Posted by Jeff N <ma...@gmail.com>.
The main ticket I found was ACCUMULO-3182, which basically told me that if
I had been using 1.6.2 when my namenode went down, I wouldn't have had to
manually scrub the problem data from all my nodes. We run our
master/slaves/zookeeper through a virtualization layer and this has been
the root of some of our problems. The goal is to move to at least the 1.6.2
distro and do away with some of the virtualization so that there isn't ever
an issue with communication between nodes. If there is any other
information I can provide, I'm happy to help.
    




Re: Weird table state

Posted by Josh Elser <jo...@gmail.com>.
Any elaboration you can give here? We recently tried to poll the 
community about EOL'ing the 1.5 series. Due to lack of response, we 
decided to cut a 1.5.3 and consider that line done.

Christopher has been triaging tickets and cherry-picking low-hanging 
fruit back to that branch. Let us know what you're stuck on and whether 
we can help make a useful release to alleviate some of the pain.

Jeff N wrote:
> Thanks. Yeah, the goal is to move towards using a more recent version of
> Accumulo because, in searching for answers to the problems I've been
> experiencing the past few days, I came across a lot of JIRA tickets
> saying the problems were resolved in later versions.

Re: Weird table state

Posted by Jeff N <ma...@gmail.com>.
Thanks. Yeah, the goal is to move towards using a more recent version of
Accumulo because, in searching for answers to the problems I've been
experiencing the past few days, I came across a lot of JIRA tickets
saying the problems were resolved in later versions.




Re: Weird table state

Posted by Josh Elser <jo...@gmail.com>.

Jeff N wrote:
> It was part of a manual repair. We had a namenode crash and recovery failed.
> The process of moving the HDFS data out and importing it into another table
> left the original table in a weird state. The recovery data was 0 bytes and
> was throwing EOF exceptions.
>

How we handle EOF on WALs with no data is something that we changed in 
"recent" versions (maybe 1.6.2 and 1.7.0?). With that change, we should 
skip over a WAL with no data in it. That might help your situation should 
you ever be so unfortunate as to run into it again.
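
As a rough illustration of the "skip a WAL with no data" idea, here is a
minimal Python sketch. It is not Accumulo's recovery code: the function name
is hypothetical, and it works on local files for simplicity, whereas real
WALs live in HDFS. It only shows the general technique of filtering out
zero-length logs before trying to read them.

```python
import os
import tempfile
from typing import Iterable, List


def non_empty_logs(paths: Iterable[str]) -> List[str]:
    """Keep only log files that hold at least one byte.

    A zero-length file cannot contain a header or any mutations, so
    reading it would only end in an EOF error; it is skipped instead.
    """
    return [p for p in paths if os.path.getsize(p) > 0]


if __name__ == "__main__":
    # Hypothetical file names on the local disk, standing in for WALs.
    with tempfile.TemporaryDirectory() as d:
        empty = os.path.join(d, "wal-empty")
        full = os.path.join(d, "wal-with-data")
        open(empty, "wb").close()
        with open(full, "wb") as f:
            f.write(b"mutation")
        print(non_empty_logs([empty, full]))
```

Filtering by size up front is the simplest form of the idea; a real
implementation would more likely catch the EOF while reading the log header.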



Re: Weird table state

Posted by Jeff N <ma...@gmail.com>.
It was part of a manual repair. We had a namenode crash and recovery failed.
The process of moving the HDFS data out and importing it into another table
left the original table in a weird state. The recovery data was 0 bytes and
was throwing EOF exceptions.




Re: Weird table state

Posted by Jeff N <ma...@gmail.com>.
Worked like a charm! Thanks for the help!




Re: Weird table state

Posted by Josh Elser <jo...@gmail.com>.
No worries, fine to ask here. You're likely running into the ZK ACL we 
set to prevent unauthenticated users from removing things from ZK. This 
is where the value you set for instance.secret comes in.

In zkCli.sh, before running your command, run the following:

addauth digest accumulo:$instance_secret

where $instance_secret is the value of instance.secret you set in 
accumulo-site.xml.

The command should execute successfully (without any warning output). 
After that, you should be able to delete to your heart's content in ZK.
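
To illustrate why the addauth step matters, here is a minimal Python sketch
of how ZooKeeper's digest scheme turns a user:password pair into the id that
is stored in an ACL. It assumes the stock digest scheme (SHA-1 of
"user:password", base64-encoded). The function names and the example secret
are made up for illustration and are not part of Accumulo or ZooKeeper.

```python
import base64
import hashlib


def zk_digest_id(user: str, password: str) -> str:
    """Return the ACL id for ZooKeeper's digest scheme.

    Assumed format: user:base64(sha1("user:password")).
    """
    raw = f"{user}:{password}".encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{user}:{digest}"


def addauth_command(instance_secret: str) -> str:
    """Build the zkCli.sh line from the message above for a given secret."""
    return f"addauth digest accumulo:{instance_secret}"


if __name__ == "__main__":
    secret = "example-secret"  # hypothetical; use your own instance.secret
    print(addauth_command(secret))
    print(zk_digest_id("accumulo", secret))
```

A session that has not run addauth with the matching secret presents no
digest identity at all, which is consistent with the "not authorized" error
on delete.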

Jeff N wrote:
> Just a quick question to perhaps avoid Internet scouring: I've sudo'd
> into the zk user from root on the appropriate node, and trying to delete the
> node for the zombie table throws a "not authorized" error. If this question
> is inappropriate for this board then I'll delete it.
>
> Using ZooKeeper 3.4.5 and the following command:
>      del -r -f /accumulo/<instance_id>/tables/<table_id>
>
> Thoughts?

Re: Weird table state

Posted by Jeff N <ma...@gmail.com>.
Just a quick question to perhaps avoid Internet scouring: I've sudo'd
into the zk user from root on the appropriate node, and trying to delete the
node for the zombie table throws a "not authorized" error. If this question
is inappropriate for this board then I'll delete it.

Using ZooKeeper 3.4.5 and the following command:
    del -r -f /accumulo/<instance_id>/tables/<table_id>

Thoughts?




Re: Weird table state

Posted by Christopher <ct...@apache.org>.
The canonical "existence" and "goal state" (online/offline) for a
table is stored in ZooKeeper. You'd have to delete
/accumulo/<instanceId>/tables/<tableId> for the table in ZK.
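
As a minimal sketch of that layout, the Python below builds the table's node
path and the zkCli.sh lines one might run against it. The function names and
the example ids are hypothetical. The use of rmr assumes the ZooKeeper 3.4.x
zkCli.sh, where it is the recursive delete (newer releases call it
deleteall); check the command for your version before running anything.

```python
from typing import List


def table_znode(instance_id: str, table_id: str) -> str:
    """Path of the ZooKeeper node that records a table's existence."""
    for name, value in (("instance_id", instance_id), ("table_id", table_id)):
        # Reject empty values or embedded slashes so a bad id cannot
        # widen the path to a parent node such as /accumulo/<id>/tables.
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty path component")
    return f"/accumulo/{instance_id}/tables/{table_id}"


def zkcli_cleanup_commands(instance_id: str, table_id: str) -> List[str]:
    """zkCli.sh lines: inspect the node first, then remove it recursively."""
    path = table_znode(instance_id, table_id)
    return [f"ls {path}", f"rmr {path}"]


if __name__ == "__main__":
    # Hypothetical ids; the real ones come from your instance.
    for line in zkcli_cleanup_commands("example-instance-id", "1a"):
        print(line)
```

Listing the node before deleting it is a cheap check that the path points at
the intended table and not at something higher in the tree.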

Out of curiosity, what version did this occur in? Was it a failed FATE
operation to delete the table, or some corruption/manual repair?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Wed, May 27, 2015 at 4:48 PM, Jeff N <ma...@gmail.com> wrote:
> I have a table where all of its HDFS data has been wiped and all of the
> information within the metadata table about said table has been wiped, but
> the monitor and shell will still list the table. The table is showing as
> online but with dashes for tablets/entries/etc. All of my interactions with
> the table lead to the shell hanging on IO when talking to the Master. Is
> there any way to wipe the table listing from another place?