Posted to notifications@accumulo.apache.org by "Keith Turner (JIRA)" <ji...@apache.org> on 2014/09/02 19:07:21 UTC

[jira] [Updated] (ACCUMULO-3096) Scans stuck and seeing error message about constraint violation

     [ https://issues.apache.org/jira/browse/ACCUMULO-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-3096:
-----------------------------------
    Attachment: ACCUMULO-3096-1.6.1-SNAPSHOT-1.patch

> Scans stuck and seeing error message about constraint violation
> ---------------------------------------------------------------
>
>                 Key: ACCUMULO-3096
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3096
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.6.1, 1.7.0
>
>         Attachments: ACCUMULO-3096-1.6.1-SNAPSHOT-1.patch
>
>
> Just helped someone debug an issue. Their scans were getting stuck on a certain tserver (determined which tserver by turning on debug in the shell). On the tserver, there was a constant stream of messages about a metadata table constraint violation because {{Bulk load transaction no longer running}}.
> The following code in {{Tablet.importMapFiles()}} 
> {code:java}
>           synchronized (timeLock) {
>             if (bulkTime > persistedTime)
>               persistedTime = bulkTime;
>             MetadataTableUtil.updateTabletDataFile(tid, extent, paths, tabletTime.getMetadataValue(persistedTime), creds, tabletServer.getLock());
>           }
> {code}
> ended up calling the following code in {{MetadataTableUtil}}:
> {code:java}
> public static void update(Credentials credentials, ZooLock zooLock, Mutation m, KeyExtent extent) {
>     Writer t = extent.isMeta() ? getRootTable(credentials) : getMetadataTable(credentials);
>     if (zooLock != null)
>       putLockID(zooLock, m);
>     while (true) {
>       try {
>         t.update(m);
>         return;
>       } catch (AccumuloException e) {
>         log.error(e, e);
>       } catch (AccumuloSecurityException e) {
>         log.error(e, e);
>       } catch (ConstraintViolationException e) {
>         log.error(e, e);
>       } catch (TableNotFoundException e) {
>         log.error(e, e);
>       }
>       UtilWaitThread.sleep(1000);
>     }
>   }
> {code}
> So when the constraint check failed, the loop retried forever. It did this while holding {{timeLock}}, which in turn prevented compactions from completing, which eventually gummed up scans.
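
For illustration only, here is a minimal sketch of one way to avoid the unbounded loop described above. It is not the attached patch and does not use Accumulo's API: {{BoundedRetrySketch}}, {{PermanentFailure}}, {{TransientFailure}}, {{Write}}, and the {{maxAttempts}} / {{sleepMillis}} parameters are all hypothetical names. The idea is that a failure that cannot succeed on retry (such as a constraint violation) is propagated to the caller immediately, while transient failures are retried only a bounded number of times, so a caller holding a lock is not blocked forever.

```java
public class BoundedRetrySketch {

  /** Hypothetical stand-in for a failure that will never succeed on retry (e.g. a constraint violation). */
  static class PermanentFailure extends RuntimeException {
    PermanentFailure(String msg) { super(msg); }
  }

  /** Hypothetical stand-in for a failure that may succeed if retried. */
  static class TransientFailure extends RuntimeException {
    TransientFailure(String msg) { super(msg); }
  }

  /** Hypothetical stand-in for the metadata write. */
  interface Write {
    void apply();
  }

  /**
   * Applies the write, retrying only transient failures and only up to maxAttempts times.
   * A PermanentFailure is not caught here, so it reaches the caller on the first attempt.
   * Returns the number of attempts that were needed.
   */
  static int update(Write w, int maxAttempts, long sleepMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        w.apply();
        return attempt;
      } catch (TransientFailure e) {
        if (attempt == maxAttempts) {
          throw new IllegalStateException("gave up after " + attempt + " attempts", e);
        }
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException ie) {
          // Restore the interrupt flag and stop retrying.
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while retrying", ie);
        }
      }
    }
    // Not reachable: the loop either returns or throws. Needed to satisfy the compiler.
    throw new IllegalStateException("unreachable");
  }

  public static void main(String[] args) {
    try {
      update(() -> { throw new PermanentFailure("Bulk load transaction no longer running"); }, 5, 10);
    } catch (PermanentFailure e) {
      System.out.println("not retried: " + e.getMessage());
    }
  }
}
```

With this shape, the caller that holds the lock sees the exception and can release the lock and decide what to do, rather than spinning inside the retry loop.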



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)