Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2016/08/03 17:57:20 UTC
[jira] [Created] (ACCUMULO-4398) Possible for client to see
TableNotFoundException adding splits on a newly created table
Josh Elser created ACCUMULO-4398:
------------------------------------
Summary: Possible for client to see TableNotFoundException adding splits on a newly created table
Key: ACCUMULO-4398
URL: https://issues.apache.org/jira/browse/ACCUMULO-4398
Project: Accumulo
Issue Type: Bug
Components: client, zookeeper
Reporter: Josh Elser
Just came across a really odd scenario. I believe that it's a race condition in the client that stems from our beloved {{ZooCache}}.
This was observed via a test failure in {{LogicalTimeIT}}:
{noformat}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 29.249 sec <<< FAILURE! - in org.apache.accumulo.test.functional.LogicalTimeIT
run(org.apache.accumulo.test.functional.LogicalTimeIT) Time elapsed: 29.037 sec <<< ERROR!
org.apache.accumulo.core.client.TableNotFoundException: Table LogicalTimeIT_run06 does not exist
at org.apache.accumulo.core.client.impl.Tables._getTableId(Tables.java:117)
at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:102)
at org.apache.accumulo.core.client.impl.TableOperationsImpl.addSplits(TableOperationsImpl.java:374)
at org.apache.accumulo.test.functional.LogicalTimeIT.runMergeTest(LogicalTimeIT.java:81)
at org.apache.accumulo.test.functional.LogicalTimeIT.run(LogicalTimeIT.java:56)
{noformat}
Ultimately:
{code}
conn.tableOperations().create(table, new NewTableConfiguration().setTimeType(TimeType.LOGICAL));
TreeSet<Text> splitSet = new TreeSet<Text>();
for (String split : splits) {
  splitSet.add(new Text(split));
}
conn.tableOperations().addSplits(table, splitSet);
{code}
The important piece to remember is that a ZooKeeper client, when a watcher is set, will eventually get all updates from that watcher in the order in which they occurred. LogicalTimeIT repeatedly runs the same test over tables of varying characteristics. I think these are the important points.
Consider the following:
# Client creates a table T1
# ZooCache is cleared after FATE op finishes
# Watcher is set on ZTABLES in ZK
# Client interacts with T1
# Client creates T2
# ZooCache is cleared after FATE op finishes
# Watcher fires on ZTABLES node in ZK (CHILDREN_CHANGED) and repopulates the child list on the ZTABLES node
# Client makes call to split T2
# Code will check if the table exists, but the childrenCache in ZooCache will have been repopulated, which will cause the client to think the table doesn't exist
# Eventually, the watcher would fire and ZTABLES would be updated and everything is fine.
I believe this is a plausible scenario, though perhaps an unlikely one.
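To make the suspected interleaving easier to reason about, here is a minimal, dependency-free sketch of one possible reading of the steps above: a child list read before T2 exists gets installed into the cache after the post-FATE clear. Everything in it is hypothetical ({{FakeZTables}}, {{ToyChildrenCache}}, {{install}}, {{replay}}); it is a toy model and does not reflect how {{ZooCache}} or the ZooKeeper client is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these classes are Accumulo or ZooKeeper APIs.
final class StaleChildrenCacheSketch {

  /** Hypothetical stand-in for the children of the ZTABLES node. */
  static final class FakeZTables {
    private final Set<String> tables = new TreeSet<>();

    void create(String table) {
      tables.add(table);
    }

    /** Returns a point-in-time copy of the child list. */
    Set<String> snapshot() {
      return new TreeSet<>(tables);
    }
  }

  /** Hypothetical stand-in for a client-side children cache. */
  static final class ToyChildrenCache {
    private Set<String> children; // null means "not cached"

    void clear() {
      children = null;
    }

    /** Installs a previously read child list, however old it is. */
    void install(Set<String> snapshot) {
      children = snapshot;
    }

    boolean tableExists(String table, FakeZTables zk) {
      if (children == null) {
        children = zk.snapshot(); // cache miss: read through
      }
      return children.contains(table);
    }
  }

  /** Replays the assumed interleaving; returns each existence check result. */
  static List<Boolean> replay() {
    FakeZTables zk = new FakeZTables();
    ToyChildrenCache cache = new ToyChildrenCache();
    List<Boolean> results = new ArrayList<>();

    zk.create("T1");
    cache.clear();                            // cleared after the FATE op
    results.add(cache.tableExists("T1", zk)); // client interacts with T1

    Set<String> staleRead = zk.snapshot();    // child list read before T2 exists

    zk.create("T2");
    cache.clear();                            // cleared after the FATE op
    cache.install(staleRead);                 // late repopulation lands after the clear
    results.add(cache.tableExists("T2", zk)); // the existence check before addSplits

    cache.clear();                            // the update for T2 eventually arrives
    results.add(cache.tableExists("T2", zk));
    return results;
  }

  public static void main(String[] args) {
    System.out.println(replay());
  }
}
```

Running {{main}} prints {{[true, false, true]}}: the check for T2 fails while the stale child list is cached and succeeds once the cache is refreshed, which matches the transient TableNotFoundException described above.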