Posted to user@accumulo.apache.org by "Ligade, Shailesh [USA]" <Li...@bah.com> on 2021/08/05 11:54:58 UTC

hdfs rack awareness and accumulo

Hello,

Our HDFS setup is rack aware with a replication factor of 3. The datanodes and tservers share the same hosts. If one rack goes down, will Accumulo still function (after HDFS re-replicates the data)?

What I am finding is that the Accumulo monitor is up and shows half the tablets as unreachable. I can get to the Accumulo shell, but I can’t scan any tables. From the log I can see there are some locks in ZooKeeper. Overall, Accumulo is up but not usable ☹ Is there any way around it?

-S

Re: RE: [External] RE: hdfs rack awareness and accumulo

Posted by Ed Coleman <ed...@apache.org>.

These come from the troubleshooting section of the Accumulo documentation (https://accumulo.apache.org/docs/2.x/troubleshooting/basic). The advanced section has quite a lot of hints concerning HDFS issues.

Is the HDFS file system up and stable? Can you run:

hadoop fsck /accumulo/path/to/corrupt/file -locations -blocks -files

Can you run:

accumulo admin checkTablets

and / or

accumulo org.apache.accumulo.server.util.FindOfflineTablets

Using the shell, can you scan the root table? The metadata table? If not, you can set the logging level of the shell to trace and then run the scan; that can provide info on what is hanging.
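
As a rough sketch of that shell check, the script below scans the root and metadata tables non-interactively with client debugging turned on. It assumes the accumulo.root and accumulo.metadata table names used by recent versions. ACCUMULO_USER, DRY_RUN, and the two helper functions are made-up names for illustration. The --debug flag may not give full trace output, depending on your version and log4j configuration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: ACCUMULO_USER, DRY_RUN, run and scan_table are made-up names,
# not Accumulo settings.
ACCUMULO_USER="${ACCUMULO_USER:-root}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

# Print a command (quoting arguments that contain spaces) or execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    for arg in "$@"; do
      case "$arg" in
        *' '*) printf "'%s' " "$arg" ;;
        *)     printf '%s ' "$arg" ;;
      esac
    done
    printf '\n'
  else
    "$@"
  fi
}

# Scan one table through the Accumulo shell with client debugging enabled.
# The shell prompts for the password because none is passed here.
scan_table() {
  run accumulo shell -u "$ACCUMULO_USER" --debug -e "scan -t $1 -np"
}

scan_table accumulo.root
scan_table accumulo.metadata
```

Set DRY_RUN=0 to actually run the scans. If the root table scan hangs, the debug output from that first command is the place to start looking.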

On 2021/08/05 12:45:09, "Ligade, Shailesh [USA]" <Li...@bah.com> wrote: 
> All tables are online (except for replication, which was off to begin with)
> 
> Master process is up
> 
> I see INFO messages like
> 
> Failed to open transport
> Waiting for file to be closed /accumulo/wal/<stopped hostname>/xxx
> RecoveryManager Volume replaced /accumulo/wal/<stopped hostname>/xxx
> 
> Is there anything specific I should be looking for?
> 
> -S

RE: [External] RE: hdfs rack awareness and accumulo

Posted by "Ligade, Shailesh [USA]" <Li...@bah.com>.

All tables are online (except for replication, which was off to begin with)

Master process is up

I see INFO messages like

Failed to open transport
Waiting for file to be closed /accumulo/wal/<stopped hostname>/xxx
RecoveryManager Volume replaced /accumulo/wal/<stopped hostname>/xxx

Is there anything specific I should be looking for?

-S

From: dev1@etcoleman.com <de...@etcoleman.com>
Sent: Thursday, August 5, 2021 8:37 AM
To: user@accumulo.apache.org
Subject: [External] RE: hdfs rack awareness and accumulo

Are there any Accumulo system tables that are offline (root, metadata)? Is there a manager (master) process available? What is the manager log saying?

RE: hdfs rack awareness and accumulo

Posted by de...@etcoleman.com.

Are there any Accumulo system tables that are offline (root, metadata)? Is there a manager (master) process available? What is the manager log saying?