Posted to dev@mahout.apache.org by "Jaroslaw Odzga (JIRA)" <ji...@apache.org> on 2011/03/11 16:26:59 UTC

[jira] Updated: (MAHOUT-625) Some of generated patterns have support higher than in reality

     [ https://issues.apache.org/jira/browse/MAHOUT-625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaroslaw Odzga updated MAHOUT-625:
----------------------------------

    Attachment: mahout-test.7z

Attached a test showing that FPGrowth has a bug. The test data is the retail dataset from http://fimi.ua.ac.be/data/.

> Some of generated patterns have support higher than in reality
> --------------------------------------------------------------
>
>                 Key: MAHOUT-625
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-625
>             Project: Mahout
>          Issue Type: Bug
>          Components: Frequent Itemset/Association Rule Mining
>    Affects Versions: 0.4
>            Reporter: Jaroslaw Odzga
>            Priority: Critical
>         Attachments: mahout-test.7z
>
>
> It turns out that some of the generated patterns have incorrect support. The returned support is slightly higher than the true value.
> I attached a test that demonstrates the bug in FPGrowth. The test uses the retail dataset found here: http://fimi.ua.ac.be/data/
> The pattern (36, 39, 41) occurs in 572 transactions (the test also computes this count), but FPGrowth returns the pattern (36, 39, 41) with support 573.
> Please note that this pattern is not the only one with incorrect support - the test only points out one example to have something to focus on. There are plenty more patterns with support higher than the real one. The biggest difference I noticed was a support 8 higher than the real one for one of the patterns.
> Please find the failing unit test attached - it is actually a Maven project that contains the test data and is ready to run.
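
For readers without the attachment, the reference count that the report compares against can be sketched roughly as follows. This is not the code from mahout-test.7z and it does not call any Mahout API; the class name SupportCounter and both method names are hypothetical. It assumes the FIMI layout of one transaction per line with whitespace-separated integer item ids, and it defines support as the number of transactions containing every item of the pattern.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout: brute-force support counting.
final class SupportCounter {

    // Parses one transaction line, assuming whitespace-separated integer item ids.
    static Set<Integer> parseTransaction(String line) {
        Set<Integer> items = new HashSet<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                items.add(Integer.parseInt(token));
            }
        }
        return items;
    }

    // Counts the transactions that contain every item of the pattern.
    static long countSupport(List<Set<Integer>> transactions, Set<Integer> pattern) {
        long support = 0;
        for (Set<Integer> transaction : transactions) {
            if (transaction.containsAll(pattern)) {
                support++;
            }
        }
        return support;
    }

    public static void main(String[] args) {
        // Toy data for illustration only; these are not lines from the retail dataset.
        List<Set<Integer>> transactions = new ArrayList<>();
        transactions.add(parseTransaction("36 39 41"));
        transactions.add(parseTransaction("36 39"));
        transactions.add(parseTransaction("36 38 39 41 48"));
        transactions.add(parseTransaction("39 41"));

        Set<Integer> pattern = parseTransaction("36 39 41");
        System.out.println("support = " + countSupport(transactions, pattern)); // prints "support = 2"
    }
}
```

Running the same count over every line of the retail file would give the true support to compare with the value FPGrowth reports for a pattern. A linear scan like this is slow for many patterns, but it is simple enough to serve as a check.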

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira