Posted to notifications@accumulo.apache.org by "Hayden Marchant (JIRA)" <ji...@apache.org> on 2014/06/24 15:56:25 UTC
[jira] [Updated] (ACCUMULO-2942) org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate failure

     [ https://issues.apache.org/jira/browse/ACCUMULO-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hayden Marchant updated ACCUMULO-2942:
--------------------------------------

    Description: 
org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate fails on IBM JRE, since the test asserts the order of elements in a HashMap. It consistently passes on Sun, and consistently fails on Oracle.

The ShardedTableDistributionFormatter inherits from AggregatingFormatter, which has two methods for subclasses to override: aggregateStats and getStats. In the ShardedTableDistributionFormatter implementation, aggregateStats prepares a list based on the HashMap, and getStats creates a string by serializing the values in the HashMap.

Because HashMap iteration order is unpredictable across Java versions (even between versions from the same vendor), the getStats() output is inconsistent. This is not a problem in itself. However, since we are asserting on the content of getStats, we must either make getStats consistent, or do some refactoring and write 2 tests: one test on the structure that getStats serializes, and another test that asserts the output of getStats based on a predictable structure.
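
To make the two-test idea concrete, here is a minimal sketch. It does not use the real Accumulo classes: aggregate, serialize, the "key=value" output format and the day strings are hypothetical stand-ins chosen for illustration. The first check compares the map itself, which does not depend on iteration order; the second checks the serialized output against a TreeMap, whose iteration order is defined.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class OrderIndependentCheckSketch {

    // Hypothetical aggregation step: counts how many entries fall on each day.
    static Map<String, Integer> aggregate(String[] days) {
        Map<String, Integer> counts = new HashMap<>();
        for (String day : days) {
            Integer old = counts.get(day);
            counts.put(day, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Hypothetical serialization step: writes entries in the map's iteration order.
    static String serialize(Map<String, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> actual =
            aggregate(new String[] {"20140623", "20140622", "20140623"});

        // Test 1: check the structure. Map.equals compares mappings, not order.
        Map<String, Integer> expected = new HashMap<>();
        expected.put("20140622", 1);
        expected.put("20140623", 2);
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected aggregate: " + actual);
        }

        // Test 2: check the output format using a TreeMap (sorted by key).
        Map<String, Integer> sorted = new TreeMap<>(expected);
        String out = serialize(sorted);
        if (!out.equals("20140622=1\n20140623=2\n")) {
            throw new AssertionError("unexpected output: " + out);
        }
    }
}
```

With this split, neither check depends on how a particular JVM orders HashMap entries.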

Some people expressed concern about changing the underlying structure from a HashMap to a TreeMap because of performance. The question is whether this code is ever executed in an environment where that would be a concern.

Alternatively, we could change only the getStats method, which runs after the 'heavy lifting' of iterating over all entries. The stats that are calculated are aggregates per day, so the structure will not be large and could be sorted before being output.
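
A rough sketch of that alternative, assuming the per-day aggregate is a map keyed by a day string. The formatStats method, the tab-separated output and the Integer value type are illustrative assumptions, not the actual ShardedTableDistributionFormatter code. The HashMap stays in place for aggregation, and only the keys are sorted at output time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortedStatsSketch {

    // Hypothetical stand-in for getStats: formats per-day counts in key order.
    static String formatStats(Map<String, Integer> countsByDay) {
        // Copy and sort the keys so the output does not depend on
        // HashMap iteration order. The map itself is left untouched.
        List<String> days = new ArrayList<>(countsByDay.keySet());
        Collections.sort(days);

        StringBuilder sb = new StringBuilder();
        for (String day : days) {
            sb.append(day).append('\t').append(countsByDay.get(day)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("20140624", 3);
        counts.put("20140622", 1);
        counts.put("20140623", 2);
        // Days are printed in ascending order regardless of insertion order.
        System.out.print(formatStats(counts));
    }
}
```

The sort costs O(n log n) in the number of distinct days, which should be small next to the pass over all entries if the structure really holds one entry per day.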


  was:
org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate fails on IBM JRE, since the test asserts the order of elements in a HashMap. It consistently passes on Sun, and consistently fails on Oracle.

Proposal: Change ShardedTableDistributionFormatter.countsByDay to TreeMap, or use non-ordered comparison 


> org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate failure
> ------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-2942
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2942
>             Project: Accumulo
>          Issue Type: Sub-task
>          Components: tserver
>    Affects Versions: 1.6.0
>         Environment: IBM JVM
>            Reporter: Hayden Marchant
>             Fix For: 1.6.1, 1.7.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h


