Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2013/02/26 05:48:14 UTC
[jira] [Commented] (HBASE-7700) TestColumnSeeking is mathematically bound to fail
[ https://issues.apache.org/jira/browse/HBASE-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586749#comment-13586749 ]
Lars Hofhansl commented on HBASE-7700:
--------------------------------------
+1 on patch. Randomness is fine as long as it is repeatable (as Elliot says) and we do not rely on any specific distribution.
This is a good short-term fix; I will commit it tomorrow unless I hear a strong case against it.
We can also file a follow-up issue to change or remove the randomness in favor of a different mechanism.
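
To illustrate what "repeatable" randomness could look like here, below is a minimal sketch, not the committed patch. It assumes the column selection is moved into a helper that draws from a seeded java.util.Random instead of Math.random(). The class name, the helper name, and the seed value are hypothetical and do not come from TestColumnSeeking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper class; not part of HBase.
class SeededColumnSelection {

    // Keeps each column independently with probability selectPercent.
    // All randomness comes from rng, so the same seed yields the same list.
    static List<String> selectColumns(List<String> allColumns,
                                      double selectPercent, Random rng) {
        List<String> selected = new ArrayList<String>();
        for (String column : allColumns) {
            if (rng.nextDouble() < selectPercent) {
                selected.add(column);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> allColumns = new ArrayList<String>();
        for (int i = 0; i < 10; i++) {
            allColumns.add("col" + i);
        }
        long seed = 12345L; // assumed value; a test would log whatever seed it used
        System.out.println("seed=" + seed);
        List<String> first = selectColumns(allColumns, 0.5, new Random(seed));
        List<String> second = selectColumns(allColumns, 0.5, new Random(seed));
        // Same seed, same selection.
        System.out.println(first.equals(second));
    }
}
```

Logging the seed is what makes a failing run reproducible: rerunning with the logged seed recreates the same column lists.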
> TestColumnSeeking is mathematically bound to fail
> -------------------------------------------------
>
> Key: HBASE-7700
> URL: https://issues.apache.org/jira/browse/HBASE-7700
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.96.0, 0.94.4
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.6
>
> Attachments: HBASE-7700-0.94-lazyfix.patch
>
>
> First I'd like to say that TestColumnSeeking is a bad test. It's not documented, it's non-deterministic, and it consists of 2 methods with almost the same code.
> So in each test it populates column lists this way:
> {code}
> for (int i = 0; i < numberOfTests; i++) {
>   kvMaps[i] = new HashMap<String, KeyValue>();
>   columnLists[i] = new ArrayList<String>();
>   for (String column : allColumns) {
>     if (Math.random() < selectPercent) {
>       columnLists[i].add(column);
>     }
>   }
> }
> {code}
> Since selectPercent is 50% and there are 10 columns, there's a 1/1024 chance (0.5^10) that any given column list ends up with 0 columns. This is later mismanaged in the checks. First something like this will be printed out:
> bq. 2013-01-28 11:50:02,200 INFO [pool-1-thread-1] regionserver.TestColumnSeeking(140): Columns: 0 Keys: 0
> Like it says, there are 0 columns, so it couldn't add data. But then it still makes sure later that the data is there with this check:
> {code}
> assertEquals(kvSet.size(), results.size());
> {code}
> Do notice that the parameters are reversed, and here the results.size() will be 0 since there are 0 columns for this test.
> I see multiple ways to fix this:
> - Skip tests that have 0 columns
> - Change the randomness to always select at least 1 column (like select 1 + 0..9 columns)
> - Redo the whole unit test to not rely on randomness
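
As a rough illustration of the second option in the list above, here is a minimal sketch under stated assumptions. It is not the attached patch. The class and method names are hypothetical, and the selection scheme (shuffle, then keep a uniformly chosen count between 1 and the number of columns) is one possible reading of "select 1 + 0..9 columns". It changes the distribution of list sizes compared with the per-column coin flip. The first helper estimates how likely an empty list is for a given number of lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper class; not part of HBase.
class NonEmptyColumnSelection {

    // Probability that at least one of numberOfLists lists is empty when each
    // of columnCount columns is kept independently with probability selectPercent.
    static double probabilityOfAnyEmptyList(int columnCount, double selectPercent,
                                            int numberOfLists) {
        double oneListEmpty = Math.pow(1.0 - selectPercent, columnCount);
        return 1.0 - Math.pow(1.0 - oneListEmpty, numberOfLists);
    }

    // Returns between 1 and allColumns.size() distinct columns.
    // Assumes allColumns is non-empty; Random.nextInt(0) would throw otherwise.
    static List<String> selectAtLeastOne(List<String> allColumns, Random rng) {
        List<String> shuffled = new ArrayList<String>(allColumns);
        Collections.shuffle(shuffled, rng);
        int count = 1 + rng.nextInt(shuffled.size()); // 1..size inclusive
        return new ArrayList<String>(shuffled.subList(0, count));
    }
}
```

With 10 columns and selectPercent of 0.5, probabilityOfAnyEmptyList(10, 0.5, 1) evaluates to 1/1024, and the value grows with the number of lists generated per run, which is why the failure shows up sooner or later.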