Posted to user@spark.apache.org by Hayri Volkan Agun <vo...@gmail.com> on 2015/08/05 15:19:43 UTC

Label based MLLib MulticlassMetrics is buggy

The results in MulticlassMetrics are totally wrong. They are calculated
improperly.
The confusion matrix may be correct, I don't know, but the per-label scores
are wrong.

-- 
Hayri Volkan Agun
PhD. Student - Anadolu University

Re: Label based MLLib MulticlassMetrics is buggy

Posted by Feynman Liang <fl...@databricks.com>.

1.5 has not yet been released; what is the commit hash that you are
building from?

On Wed, Aug 5, 2015 at 10:29 AM, Hayri Volkan Agun <vo...@gmail.com>
wrote:

> Hi,
>
> In Spark 1.5 I got a precision of 1.0 and a recall of 0.01 for decision
> tree classification.
> With a precision of 100%, the recall shouldn't be so small... I checked the
> code and everything seems OK,
> but I can't explain why I got such a result. As far as I understand from
> the Scala code, the row sums are the actual
> class counts and the column sums are the prediction counts. Am I right?
> I am running additional tests to compare against my own code...
> I attached a document; my Reuters tests are on page 3.
>
>
> On Wed, Aug 5, 2015 at 7:57 PM, Feynman Liang <fl...@databricks.com>
> wrote:
>
>> Also, what version of Spark are you using?
>>
>> On Wed, Aug 5, 2015 at 9:57 AM, Feynman Liang <fl...@databricks.com>
>> wrote:
>>
>>> Hi Hayri,
>>>
>>> Can you provide a sample of the expected and actual results?
>>>
>>> Feynman
>>>
>>> On Wed, Aug 5, 2015 at 6:19 AM, Hayri Volkan Agun <vo...@gmail.com>
>>> wrote:
>>>
>>>> The results in MulticlassMetrics are totally wrong. They are calculated
>>>> improperly.
>>>> The confusion matrix may be correct, I don't know, but the per-label scores
>>>> are wrong.
>>>>
>>>> --
>>>> Hayri Volkan Agun
>>>> PhD. Student - Anadolu University
>>>>
>>>
>>>
>>
>
>
> --
> Hayri Volkan Agun
> PhD. Student - Anadolu University
>
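
For reference, the row/column convention asked about in the quoted message
can be checked by hand with a small standalone sketch. This is plain Python,
not Spark code: the function name per_label_metrics and the counts are
hypothetical, and it assumes rows are actual labels and columns are predicted
labels, as described in the message above. With that convention, precision
for a label is the diagonal entry divided by the column sum, and recall is
the diagonal entry divided by the row sum.

```python
# Per-label precision and recall computed from a confusion matrix.
# Assumption: confusion[i][j] counts instances whose actual label is i
# and whose predicted label is j (rows = actual, columns = predicted).

def per_label_metrics(confusion):
    n = len(confusion)
    metrics = {}
    for k in range(n):
        true_positives = confusion[k][k]
        actual_count = sum(confusion[k])                            # row sum
        predicted_count = sum(confusion[i][k] for i in range(n))    # column sum
        precision = true_positives / predicted_count if predicted_count else 0.0
        recall = true_positives / actual_count if actual_count else 0.0
        metrics[k] = (precision, recall)
    return metrics

# Hypothetical counts, chosen only for illustration.
confusion = [
    [1, 99],   # actual label 0: 1 predicted as 0, 99 predicted as 1
    [0, 100],  # actual label 1: all 100 predicted as 1
]

metrics = per_label_metrics(confusion)
for label, (precision, recall) in metrics.items():
    print(label, precision, recall)
```

In this made-up matrix, label 0 is predicted only once and that prediction is
correct, so its precision is 1.0 while its recall is 1/100. A precision of 1.0
together with a very small recall is therefore arithmetically possible when a
label is predicted rarely; comparing the reported scores against the confusion
matrix in this way may help show whether the metrics or the model is the cause.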

Re: Label based MLLib MulticlassMetrics is buggy

Posted by Feynman Liang <fl...@databricks.com>.

Also, what version of Spark are you using?

On Wed, Aug 5, 2015 at 9:57 AM, Feynman Liang <fl...@databricks.com> wrote:

> Hi Hayri,
>
> Can you provide a sample of the expected and actual results?
>
> Feynman
>
> On Wed, Aug 5, 2015 at 6:19 AM, Hayri Volkan Agun <vo...@gmail.com>
> wrote:
>
>> The results in MulticlassMetrics are totally wrong. They are calculated
>> improperly.
>> The confusion matrix may be correct, I don't know, but the per-label scores
>> are wrong.
>>
>> --
>> Hayri Volkan Agun
>> PhD. Student - Anadolu University
>>
>
>

Re: Label based MLLib MulticlassMetrics is buggy

Posted by Feynman Liang <fl...@databricks.com>.

Hi Hayri,

Can you provide a sample of the expected and actual results?

Feynman

On Wed, Aug 5, 2015 at 6:19 AM, Hayri Volkan Agun <vo...@gmail.com>
wrote:

> The results in MulticlassMetrics are totally wrong. They are calculated
> improperly.
> The confusion matrix may be correct, I don't know, but the per-label scores
> are wrong.
>
> --
> Hayri Volkan Agun
> PhD. Student - Anadolu University
>