Posted to notifications@accumulo.apache.org by "Adam J Shook (JIRA)" <ji...@apache.org> on 2017/12/06 20:35:00 UTC

[jira] [Comment Edited] (ACCUMULO-4751) Some WALs don't replicate due to lacking a createdTime entry

    [ https://issues.apache.org/jira/browse/ACCUMULO-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280853#comment-16280853 ] 

Adam J Shook edited comment on ACCUMULO-4751 at 12/6/17 8:34 PM:
-----------------------------------------------------------------

I have attached some logs tracking a particular WAL file.  It starts out with a {{createdTime}}, but at some point a deleting entry must be written (note that the timestamp changes and the {{createdTime}} is gone), after which other entries are added.

And some other interesting messages back-to-back:
{code}
2017-12-06 19:55:37,712 [replication.StatusCombiner] TRACE: Returned single value: ~replhdfs://namenode:9000/accumulo/wal/tserver+31761/140223d6-30bd-41ae-a96d-d8af9884f85c stat:114 [] 14898338 false [begin: 0 end: 0 infiniteEnd: true closed: true createdTime: 1512589530002]
2017-12-06 19:55:37,712 [replication.StatusCombiner] TRACE: Returned single value: ~replhdfs://namenode:9000/accumulo/wal/tserver+31761/140223d6-30bd-41ae-a96d-d8af9884f85c stat:12l [] 14898372 false [begin: 0 end: 0 infiniteEnd: true closed: false]
{code}


was (Author: adamjshook):
I have attached some logs tracking a particular WAL file.  It starts out with a {{createdTime}}, but at some point a deleting entry must be written (note that the timestamp changes and the {{createdTime}} is gone), after which other entries are added.

And some other interesting messages back-to-back:
{code}
2017-12-06 19:55:37,712 [replication.StatusCombiner] TRACE: Returned single value: ~replhdfs://dev-ob-Cluster/accumulo/wal/dob1-bvlt-r2n05.bloomberg.com+31761/140223d6-30bd-41ae-a96d-d8af9884f85c stat:114 [] 14898338 false [begin: 0 end: 0 infiniteEnd: true closed: true createdTime: 1512589530002]
2017-12-06 19:55:37,712 [replication.StatusCombiner] TRACE: Returned single value: ~replhdfs://dev-ob-Cluster/accumulo/wal/dob1-bvlt-r2n05.bloomberg.com+31761/140223d6-30bd-41ae-a96d-d8af9884f85c stat:12l [] 14898372 false [begin: 0 end: 0 infiniteEnd: true closed: false]
{code}

> Some WALs don't replicate due to lacking a createdTime entry
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-4751
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4751
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.7.3, 1.8.1
>            Reporter: Adam J Shook
>            Assignee: Adam J Shook
>         Attachments: repl_logs.txt
>
>
> From what I can tell, the error below is thrown when a WAL is closed without any data for a particular table having been written to it.  This appears to be because the {{Status}} entry that {{StatusUtil}} returns for {{fileClosed}} is pre-built and therefore does not have a {{createdTime}}.  As a result, the WAL is not replicated until a {{createdTime}} entry is added manually.
> From the Accumulo master:
> {code}
> Status record ([begin: 0 end: 0 infiniteEnd: true closed: true]) for hdfs://namenode:9000/accumulo/wal/tserver.example.com+31732/f922df9c-3ffc-49ee-8d0c-261c7a05fea2 in table 7l was written to metadata table which lacked createdTime
> {code}
> There are two solutions I have in mind:
> 1. Update the {{StatusUtil}} such that every returned {{Status}} object sets the {{createdTime}} to {{System.currentTimeMillis}} if not explicitly given.
> 2. Update the Accumulo Master to set the {{createdTime}} to the WAL's modification time in HDFS if the WAL is closed but there is no {{createdTime}}.
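
To make the two proposed fixes concrete, here is a minimal Java sketch of both. It does not use Accumulo's actual classes: {{WalStatus}}, {{withDefaultCreatedTime}} and {{withFallbackFromModTime}} are hypothetical names invented for illustration, and the modification time is passed in as a plain long rather than read from HDFS. Treat it as a sketch of the idea under those assumptions, not as the patch itself.

```java
import java.util.OptionalLong;

public class CreatedTimeSketch {

    /** Hypothetical, simplified stand-in for a replication status record (not Accumulo's Status). */
    static final class WalStatus {
        final boolean closed;
        final OptionalLong createdTime;

        WalStatus(boolean closed, OptionalLong createdTime) {
            this.closed = closed;
            this.createdTime = createdTime;
        }
    }

    /** Option 1 (sketch): fill in createdTime when the record is built without one. */
    static WalStatus withDefaultCreatedTime(WalStatus status, long nowMillis) {
        if (status.createdTime.isPresent()) {
            return status; // an explicitly given createdTime is never overwritten
        }
        return new WalStatus(status.closed, OptionalLong.of(nowMillis));
    }

    /** Option 2 (sketch): for a closed WAL lacking createdTime, fall back to the file's modification time. */
    static WalStatus withFallbackFromModTime(WalStatus status, long walModificationTimeMillis) {
        if (!status.closed || status.createdTime.isPresent()) {
            return status; // only closed records without a createdTime are patched
        }
        return new WalStatus(true, OptionalLong.of(walModificationTimeMillis));
    }

    public static void main(String[] args) {
        // A pre-built "file closed" record with no createdTime, as described in the report.
        WalStatus prebuilt = new WalStatus(true, OptionalLong.empty());

        WalStatus viaOption1 = withDefaultCreatedTime(prebuilt, System.currentTimeMillis());
        // The modification time below is an illustrative value, not one read from HDFS.
        WalStatus viaOption2 = withFallbackFromModTime(prebuilt, 1512589530002L);

        System.out.println("option 1 has createdTime: " + viaOption1.createdTime.isPresent());
        System.out.println("option 2 createdTime: " + viaOption2.createdTime.getAsLong());
    }
}
```

In this sketch, option 1 stamps the record where it is built, while option 2 repairs it later on the master side. In both cases a {{createdTime}} that is already present is left alone.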


