Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/11/03 00:08:27 UTC

[jira] [Commented] (SOLR-7802) TestDistributedStatsComponentCardinality failure

    [ https://issues.apache.org/jira/browse/SOLR-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986230#comment-14986230 ] 

Yonik Seeley commented on SOLR-7802:
------------------------------------

{quote}
if you build 2 HLL instances with different log2m settings, and add the exact same set of (raw) values to both, then the HLL with the larger log2m will give you more accurate results than the HLL with the smaller log2m setting.
{quote}

Is that really true for any given set of raw values, or is it only true on average?
These are just estimates after all, and a guarantee that holds for every input set would be a very difficult (and interesting) property to achieve. At first blush, the claim as stated seems false.
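
One way to probe the question is a small experiment. The following is a minimal sketch under stated assumptions, not the HLL implementation Solr uses: it is a plain textbook HyperLogLog with a 64-bit mixing function standing in for the hash, and the class name HllAccuracySketch and its methods are hypothetical. It feeds the same distinct values to a small-log2m and a large-log2m estimator over many trials and counts how often the smaller one happens to land closer to the true cardinality.

```java
public class HllAccuracySketch {

    // 64-bit finalizer from MurmurHash3 (fmix64), used here as a stand-in hash.
    static long mix64(long z) {
        z ^= z >>> 33;
        z *= 0xff51afd7ed558ccdL;
        z ^= z >>> 33;
        z *= 0xc4ceb9fe1a85ec53L;
        z ^= z >>> 33;
        return z;
    }

    // Textbook HyperLogLog estimate for the n distinct values start .. start+n-1.
    // Assumes 7 <= log2m <= 16 so the alpha constant below applies.
    static double estimate(long start, int n, int log2m) {
        int m = 1 << log2m;
        byte[] registers = new byte[m];
        for (int i = 0; i < n; i++) {
            long h = mix64(start + i);
            int idx = (int) (h >>> (64 - log2m));   // top log2m bits pick the register
            long rest = h << log2m;                 // remaining bits give the rank
            int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - log2m + 1);
            if (rank > registers[idx]) {
                registers[idx] = (byte) rank;
            }
        }
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1 + 1.079 / m);
        double raw = alpha * m * (double) m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            // Small-range correction (linear counting).
            return m * Math.log((double) m / zeros);
        }
        return raw;
    }

    // Number of trials in which the smaller log2m was strictly closer to the truth.
    static int countSmallWins(int n, int trials, int smallLog2m, int largeLog2m) {
        int wins = 0;
        for (int t = 0; t < trials; t++) {
            long start = 1L + (long) t * n;         // disjoint value range per trial
            double errSmall = Math.abs(estimate(start, n, smallLog2m) - n);
            double errLarge = Math.abs(estimate(start, n, largeLog2m) - n);
            if (errSmall < errLarge) {
                wins++;
            }
        }
        return wins;
    }

    public static void main(String[] args) {
        int n = 100_000, trials = 200, small = 8, large = 12;
        int wins = countSmallWins(n, trials, small, large);
        System.out.println("log2m=" + small + " beat log2m=" + large
                + " in " + wins + " of " + trials + " trials");
    }
}
```

If the claim held for every set of raw values, the reported count would have to be zero for any choice of inputs. A non-zero count from a run like this would suggest the property only holds on average, which is what the standard error of roughly 1.04/sqrt(m) would lead one to expect.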

> TestDistributedStatsComponentCardinality failure
> ------------------------------------------------
>
>                 Key: SOLR-7802
>                 URL: https://issues.apache.org/jira/browse/SOLR-7802
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.3, Trunk
>            Reporter: Steve Rowe
>            Priority: Minor
>         Attachments: TestDistributedStatsComponentCardinality.tests-failures.txt
>
>
> Original trunk failure on Linux: [http://jenkins.sarowe.net/job/Lucene-Solr-tests-trunk/773/].  Reproduced with the repro line on OS X, both with trunk/Java8 and branch_5x/java7:
> {noformat}
>   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestDistributedStatsComponentCardinality -Dtests.method=test -Dtests.seed=87100DE827E75E41 -Dtests.slow=true -Dtests.locale=sr_RS -Dtests.timezone=Zulu -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
> {noformat}
> {noformat}
> Stack Trace:
> java.lang.AssertionError: int_i: goodEst=13957, poorEst=13970, real=13980, p=q=id%3A%5B88+TO+14067%5D&rows=0&stats=true&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_int_i%7Dint_i&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_int_i_prehashed_l+hllPreHashed%3Dtrue%7Dint_i_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_int_i%7Dint_i&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_int_i_prehashed_l+hllPreHashed%3Dtrue%7Dint_i_prehashed_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_long_l%7Dlong_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_long_l_prehashed_l+hllPreHashed%3Dtrue%7Dlong_l_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_long_l%7Dlong_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_long_l_prehashed_l+hllPreHashed%3Dtrue%7Dlong_l_prehashed_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_string_s%7Dstring_s&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_string_s_prehashed_l+hllPreHashed%3Dtrue%7Dstring_s_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_string_s%7Dstring_s&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_string_s_prehashed_l+hllPreHashed%3Dtrue%7Dstring_s_prehashed_l
> 	at __randomizedtesting.SeedInfo.seed([87100DE827E75E41:F443232891B33B9]:0)
> 	at org.junit.Assert.fail(Assert.java:93)
> 	at org.junit.Assert.assertTrue(Assert.java:43)
> 	at org.apache.solr.handler.component.TestDistributedStatsComponentCardinality.test(TestDistributedStatsComponentCardinality.java:216)
> [...]
> {noformat}


