Posted to issues@hbase.apache.org by "Mikhail Antonov (JIRA)" <ji...@apache.org> on 2016/07/03 10:08:11 UTC

[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

    [ https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360492#comment-15360492 ] 

Mikhail Antonov commented on HBASE-16074:
-----------------------------------------

[~stack]

So I gave up on the idea of reproducing this on a minicluster. It's pretty unreliable, and given the number of iterations required and the time it takes to run a reasonable number of local ITBLL iterations, it's not really that much faster than doing a manual bisect and redeploying on a real cluster every time.

So I started off with 17b39763c96aaca45208cba0ce9ce4fb931eb959 (where branch-1.3 was cut off branch-1; this one is good), and after 7-8 steps it looks like this is the commit that causes the problem: 4b69faa1903303419dfcf027a2268524816c7a35, HBASE-15650. The previous commit doesn't seem to lose any data on the verify step over multiple iterations; this one lost data in about 3 out of 4 consecutive runs. Looking into why things break.
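
For reference, the bisect procedure above can be sketched roughly as follows. This is only an illustration, not the tooling actually used: the names first_bad_commit, run_fails and runs_per_commit are hypothetical, and run_fails stands in for "deploy this commit to the cluster, run ITBLL once, and report whether verify lost data". Because the failure is intermittent, a commit is only treated as good after several clean runs; the default of 4 runs is an assumption.

```python
from typing import Callable, Sequence


def first_bad_commit(
    commits: Sequence[str],
    run_fails: Callable[[str], bool],
    runs_per_commit: int = 4,
) -> str:
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good and commits[-1] is known bad.
    `run_fails` is a hypothetical hook: one ITBLL run against the given
    commit, returning True if the verify step reported lost data.
    """

    def is_bad(commit: str) -> bool:
        # The failure is flaky, so one clean run proves little.
        # any() stops at the first failing run.
        return any(run_fails(commit) for _ in range(runs_per_commit))

    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

With a couple of hundred commits between the good and bad endpoints, this takes about 7-8 halving steps, which matches the step count mentioned above. A flaky failure can still be mislabeled as good if every run at that commit happens to pass, so raising runs_per_commit trades cluster time for confidence.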



> ITBLL fails, reports lost big or tiny families
> ----------------------------------------------
>
>                 Key: HBASE-16074
>                 URL: https://issues.apache.org/jira/browse/HBASE-16074
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>    Affects Versions: 1.3.0
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>            Priority: Blocker
>             Fix For: 1.3.0
>
>         Attachments: changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or tiny families, count=164
> I do not know yet whether it's a bug, a test issue, or an env setup issue, but I need to figure it out. Opening this to raise awareness and see if anyone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)