Posted to user@mahout.apache.org by Thomas Söhngen <th...@beluto.com> on 2011/05/09 02:14:10 UTC

Understanding log-likelihood

Hello,

I struggle to understand the log-likelihood function. I would highly 
welcome a simple example of how it is calculated, especially in Mahout.

Thanks in advance,
Thomas

Re: Understanding log-likelihood

Posted by Ted Dunning <te...@gmail.com>.
Try this:

http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html

On Sun, May 8, 2011 at 5:14 PM, Thomas Söhngen <th...@beluto.com> wrote:

> Hello,
>
> I struggle to understand the log-likelihood function. I would highly
> welcome a simple example of how it is calculated, especially in Mahout.
>
> Thanks in advance,
> Thomas
>

Re: Understanding log-likelihood

Posted by Sean Owen <sr...@gmail.com>.
I can try to explain one understanding of the meaning, though it is
not really the intuitive explanation of the formulation in Mahout,
rather a somewhat different one I originally used. And even that one
I only understand about 80%.

Two users are similar when they rate or are associated to many of the
same items. However, a certain overlap may or may not be meaningful --
it could be due to chance, or due to the fact that we have similar
tastes. For example if you and I have rated 100 items each, and 50
overlap, we're probably similar. But if we've each rated 1000 and
overlap in only 50, maybe we're not.

The log-likelihood metric is just trying to formally quantify how
unlikely it is that our overlap is due to chance. The less likely, the
more similar we are.

So it is comparing two likelihoods, and just looking at their ratio.
The numerator likelihood is the null hypothesis: we're not similar and
overlap is due to chance. The denominator is the likelihood that it's
not at all due to chance -- that the overlap is perfectly explained by
our tastes being similar, and is exactly what you'd expect given that.

When the numerator is relatively small, the null hypothesis is
relatively unlikely, so we are similar.

The reason the formulation typically then takes -2.0 * log (likelihood
ratio) is by convention, and it makes the result a bit more useful in
two ways. One, more similarity will equal a higher log-likelihood,
which is perhaps more intuitive than the likelihood ratio which is
lower when similarity is higher. But the real reason is that the
log-likelihood value then follows a chi-squared distribution and the
result can be used to actually figure a probability that the users are
similar or not. (We don't use it that way in Mahout though.)
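As a rough sketch of that ratio (not Mahout's actual code; the binomial model and the counts below are illustrative assumptions of mine), one could compare the best-fit likelihood of two samples under a single shared rate against separately fitted rates:

```python
import math

def log_binom(k, n, p):
    """Log-likelihood of k successes in n Bernoulli trials at rate p.
    The binomial coefficient cancels in the ratio, so it is omitted."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll

def neg2_log_ratio(k1, n1, k2, n2):
    """-2 * log(likelihood ratio): the null hypothesis uses one shared
    rate for both samples, the alternative fits each sample its own
    maximum-likelihood rate."""
    p_null = (k1 + k2) / (n1 + n2)          # MLE under the null
    log_null = log_binom(k1, n1, p_null) + log_binom(k2, n2, p_null)
    log_alt = log_binom(k1, n1, k1 / n1) + log_binom(k2, n2, k2 / n2)
    return -2.0 * (log_null - log_alt)
```

With equal observed rates the score is 0; the more the two rates differ, the larger it gets.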

And Ted's formulation, which is also right, quite tidy, and the one
used in the project, is based on Shannon entropy. I understand it, I
believe, but would have to think more about an intuitive explanation.

It is, similarly, trying to figure out whether the co-occurrences are
"unusually" frequent by asking whether there is any additional
information to be gained by looking at user 1 and user 2's preferences
separately versus everything at once. If there is, then there is
something specially related about user 1 and user 2 and they're
similar.
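To make the 100-versus-1000 example above concrete, here is a small sketch (Python; the 10,000-item catalog size is my own made-up number, not something from the thread) using the entropy form of the score:

```python
import math

def h(counts):
    """Sum of (c/N) * log(c/N) over the counts, skipping zero cells.
    H(k) - H(rowSums) - H(colSums) then equals the mutual information
    of the 2x2 table."""
    n = sum(counts)
    return sum((c / n) * math.log(c / n) for c in counts if c > 0)

def llr(k11, k12, k21, k22):
    """2 * N * (H(k) - H(rowSums(k)) - H(colSums(k))) for a 2x2 table:
    k11 = items both users rated, k12/k21 = items only one rated,
    k22 = items neither rated."""
    n = k11 + k12 + k21 + k22
    return 2.0 * n * (h([k11, k12, k21, k22])
                      - h([k11 + k12, k21 + k22])    # row sums
                      - h([k11 + k21, k12 + k22]))   # column sums

# 100 ratings each, 50 shared, out of a hypothetical 10,000-item catalog:
strong = llr(50, 50, 50, 9850)
# 1,000 ratings each, still only 50 shared:
weak = llr(50, 950, 950, 8050)
print(strong, weak)  # the first score is far larger than the second
```

In the first scenario the 50-item overlap is wildly unlikely by chance (about 1 co-rated item would be expected), so the score is large; in the second the overlap is close to what chance predicts and the score is much smaller.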


On Mon, May 9, 2011 at 3:09 AM, Thomas Söhngen <th...@beluto.com> wrote:
> Thank you for the explanation. I can understand the calculations now, but I
> still don't get the meaning. I think I'll try to sleep a night over it and
> try again tomorrow.
>
> Best regards,
> Thomas
>
> Am 09.05.2011 03:42, schrieb Ted Dunning:
>>
>> In this notation, k is assumed to be a matrix.  k_11 is the element in the
>> first row and first column.
>>
>> I used k to sound like count.
>>
>> The notation that you quote is R syntax.  rowSums is a function that
>> computes the row-wise sums of the argument k.  H is a function defined
>> elsewhere.
>>
>> On Sun, May 8, 2011 at 6:33 PM, Thomas Söhngen<th...@beluto.com>  wrote:
>>
>>> Thank you for the blog post and showing me the G-test formula.
>>>
>>> After going through your blog post, I still have some open questions: You
>>> introduce k_11 to k_22, but I don't understand what "k" itself actually
>>> stands for in your formula and how the sums are defined: LLR = 2 sum(k)
>>> (H(k) - H(rowSums(k)) - H(colSums(k)))
>>>
>>> Am 09.05.2011 02:46, schrieb Ted Dunning:
>>>
>>>> My guess is that the OP was asking about the generalized log-likelihood
>>>> ratio test used in the Mahout recommendation framework.
>>>>
>>>> That is a bit different from what you describe in that it is the log of the
>>>> ratio of two maximum likelihoods.
>>>>
>>>> See http://en.wikipedia.org/wiki/G-test for a definition of the test used
>>>> in Mahout.
>>>>
>>>> On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi<je...@lewi.us>   wrote:
>>>>
>>>>> Thomas,
>>>>>
>>>>> Are you asking a general question about log-likelihood or a specific
>>>>> implementation usage in Mahout?
>>>>>
>>>>> In general the likelihood is just a number, between 0 and 1 which
>>>>> measures the probability of observing some data under some
>>>>> distribution.
>>>>>
>>>>>
>>>>>
>

Re: Understanding log-likelihood

Posted by Ted Dunning <te...@gmail.com>.
Well, you have good company.

Meaning eludes us all in some sense.

On Sun, May 8, 2011 at 7:09 PM, Thomas Söhngen <th...@beluto.com> wrote:

> I can understand the calculations now, but I still don't get the meaning.

Re: Understanding log-likelihood

Posted by Thomas Söhngen <th...@beluto.com>.
Thank you for the explanation. I can understand the calculations now, 
but I still don't get the meaning. I think I'll try to sleep a night 
over it and try again tomorrow.

Best regards,
Thomas

Am 09.05.2011 03:42, schrieb Ted Dunning:
> In this notation, k is assumed to be a matrix.  k_11 is the element in the
> first row and first column.
>
> I used k to sound like count.
>
> The notation that you quote is R syntax.  rowSums is a function that
> computes the row-wise sums of the argument k.  H is a function defined
> elsewhere.
>
> On Sun, May 8, 2011 at 6:33 PM, Thomas Söhngen<th...@beluto.com>  wrote:
>
>> Thank you for the blog post and showing me the G-test formula.
>>
>> After going through your blog post, I still have some open questions: You
>> introduce k_11 to k_22, but I don't understand what "k" itself actually
>> stands for in your formula and how the sums are defined: LLR = 2 sum(k)
>> (H(k) - H(rowSums(k)) - H(colSums(k)))
>>
>> Am 09.05.2011 02:46, schrieb Ted Dunning:
>>
>>> My guess is that the OP was asking about the generalized log-likelihood
>>> ratio test used in the Mahout recommendation framework.
>>>
>>> That is a bit different from what you describe in that it is the log of the
>>> ratio of two maximum likelihoods.
>>>
>>> See http://en.wikipedia.org/wiki/G-test for a definition of the test used
>>> in Mahout.
>>>
>>> On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi<je...@lewi.us>   wrote:
>>>
>>>> Thomas,
>>>> Are you asking a general question about log-likelihood or a specific
>>>> implementation usage in Mahout?
>>>>
>>>> In general the likelihood is just a number, between 0 and 1 which
>>>> measures the probability of observing some data under some distribution.
>>>>
>>>>
>>>>

Re: Understanding log-likelihood

Posted by Ted Dunning <te...@gmail.com>.
In this notation, k is assumed to be a matrix.  k_11 is the element in the
first row and first column.

I used k to sound like count.

The notation that you quote is R syntax.  rowSums is a function that
computes the row-wise sums of the argument k.  H is a function defined
elsewhere.
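For readers who don't know R, that expression could be transcribed into Python roughly as follows (a sketch only; the example counts are made up):

```python
import math

def h(counts):
    """The H used above: sum of (c/N) * log(c/N), skipping zero cells."""
    n = sum(counts)
    return sum((c / n) * math.log(c / n) for c in counts if c > 0)

# k is the 2x2 count matrix; k[0][0] is k_11, k[0][1] is k_12, and so on.
k = [[13, 1000],
     [1000, 100000]]

flat = [c for row in k for c in row]        # all four cells of k
row_sums = [sum(row) for row in k]          # rowSums(k) in R
col_sums = [sum(col) for col in zip(*k)]    # colSums(k) in R

llr = 2 * sum(flat) * (h(flat) - h(row_sums) - h(col_sums))
print(llr)
```

rowSums and colSums are just the marginal totals of the table, and H(k) - H(rowSums(k)) - H(colSums(k)) works out to the mutual information between the row and column variables, which is zero exactly when the counts are independent.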

On Sun, May 8, 2011 at 6:33 PM, Thomas Söhngen <th...@beluto.com> wrote:

> Thank you for the blog post and showing me the G-test formula.
>
> After going through your blog post, I still have some open questions: You
> introduce k_11 to k_22, but I don't understand what "k" itself actually
> stands for in your formula and how the sums are defined: LLR = 2 sum(k)
> (H(k) - H(rowSums(k)) - H(colSums(k)))
>
> Am 09.05.2011 02:46, schrieb Ted Dunning:
>
>> My guess is that the OP was asking about the generalized log-likelihood
>> ratio test used in the Mahout recommendation framework.
>>
>> That is a bit different from what you describe in that it is the log of the
>> ratio of two maximum likelihoods.
>>
>> See http://en.wikipedia.org/wiki/G-test for a definition of the test used
>> in Mahout.
>>
>> On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi<je...@lewi.us>  wrote:
>>
>>> Thomas,
>>>
>>> Are you asking a general question about log-likelihood or a specific
>>> implementation usage in Mahout?
>>>
>>> In general the likelihood is just a number, between 0 and 1 which
>>> measures the probability of observing some data under some distribution.
>>>
>>>
>>>

Re: Understanding log-likelihood

Posted by Thomas Söhngen <th...@beluto.com>.
Thank you for the blog post and showing me the G-test formula.

After going through your blog post, I still have some open questions: 
You introduce k_11 to k_22, but I don't understand what "k" itself 
actually stands for in your formula and how the sums are defined: LLR = 
2 sum(k) (H(k) - H(rowSums(k)) - H(colSums(k)))

Am 09.05.2011 02:46, schrieb Ted Dunning:
> My guess is that the OP was asking about the generalized log-likelihood
> ratio test used in the Mahout recommendation framework.
>
> That is a bit different from what you describe in that it is the log of the
> ratio of two maximum likelihoods.
>
> See http://en.wikipedia.org/wiki/G-test for a definition of the test used in
> Mahout.
>
> On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi<je...@lewi.us>  wrote:
>
>> Thomas,
>>
>> Are you asking a general question about log-likelihood or a specific
>> implementation usage in Mahout?
>>
>> In general the likelihood is just a number, between 0 and 1 which
>> measures the probability of observing some data under some distribution.
>>
>>

Re: Understanding log-likelihood

Posted by Jeremy Lewi <je...@lewi.us>.
I figured it was probably more Mahout-specific and not general stats.

J
On Sun, 2011-05-08 at 17:46 -0700, Ted Dunning wrote:
> My guess is that the OP was asking about the generalized log-likelihood
> ratio test used in the Mahout recommendation framework.
> 
> That is a bit different from what you describe in that it is the log of the
> ratio of two maximum likelihoods.
> 
> See http://en.wikipedia.org/wiki/G-test for a definition of the test used in
> Mahout.
> 
> On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi <je...@lewi.us> wrote:
> 
> > Thomas,
> >
> > Are you asking a general question about log-likelihood or a specific
> > implementation usage in Mahout?
> >
> > In general the likelihood is just a number, between 0 and 1 which
> > measures the probability of observing some data under some distribution.
> >
> >


Re: Understanding log-likelihood

Posted by Ted Dunning <te...@gmail.com>.
My guess is that the OP was asking about the generalized log-likelihood
ratio test used in the Mahout recommendation framework.

That is a bit different from what you describe in that it is the log of the
ratio of two maximum likelihoods.

See http://en.wikipedia.org/wiki/G-test for a definition of the test used in
Mahout.

On Sun, May 8, 2011 at 5:43 PM, Jeremy Lewi <je...@lewi.us> wrote:

> Thomas,
>
> Are you asking a general question about log-likelihood or a specific
> implementation usage in Mahout?
>
> In general the likelihood is just a number, between 0 and 1 which
> measures the probability of observing some data under some distribution.
>
>

Re: Understanding log-likelihood

Posted by Jeremy Lewi <je...@lewi.us>.
Thomas,

Are you asking a general question about log-likelihood or a specific
implementation usage in Mahout?

In general the likelihood is just a number between 0 and 1 which
measures the probability of observing some data under some distribution.

So as a simple example, consider a coin toss. The probability of
observing heads is .5 and the probability of observing tails is .5.

Now suppose you observe a coin toss, and the outcome is heads. We can
ask how likely this outcome was under the assumption that the coin was
fair. The likelihood in this case is just .5, because the coin is fair.

Now suppose you observe two coin tosses, and the outcome is two heads.
How likely is this outcome? Since the tosses are independent, the
probability of getting two heads is simply the product of the
probabilities of getting heads on each flip:
p(two heads | fair coin) = .5 * .5 = .25

We can also think of this as a counting problem and consider all
possible outcomes for flipping two coins. These are {HH,TH,HT,TT}.
Since the coin is fair all of these outcomes have equal probability of
1/4 or .25.

The probabilities we computed above are the likelihoods. The
log-likelihood is just the result of taking the log of them.
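The coin-toss computations above can be written out directly in a tiny sketch (the helper name is made up):

```python
import math

def sequence_likelihood(outcomes, p_heads=0.5):
    """Probability of observing exactly this sequence of tosses,
    assuming independent tosses of a coin with the given heads rate."""
    p = 1.0
    for o in outcomes:
        p *= p_heads if o == "H" else 1.0 - p_heads
    return p

one_head = sequence_likelihood("H")     # 0.5
two_heads = sequence_likelihood("HH")   # 0.5 * 0.5 = 0.25
log_lik = math.log(two_heads)           # the log-likelihood, log(0.25)
```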

This is a pretty meager explanation so feel free to ask for
clarification.

J


On Mon, 2011-05-09 at 02:14 +0200, Thomas Söhngen wrote:
> Hello,
> 
> I struggle to understand the log-likelihood function. I would highly 
> welcome a simple example of how it is calculated, especially in Mahout.
> 
> Thanks in advance,
> Thomas