Posted to user@predictionio.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2016/12/01 17:31:24 UTC

Re: Tuning of Recommendation Engine

Exactly so. The weighting of events is done by the algorithm. Adding biases would very likely be wrong and would give worse results, so it is not supported in the current code. There may be a place for this type of bias, but it would have to be done in conjunction with the cross-validation tests we have in our MAP test suite, and that is not yet supported. Best to leave the default weighting from the CCO algorithm, which is based on the strength of each event's correlation with the conversion event, which I would guess is purchase in your case.


On Nov 28, 2016, at 2:19 PM, Magnus Kragelund <ma...@ida.dk> wrote:

Hi,
It's my understanding that you cannot apply a bias to an event, such as "view" or "purchase", at query time. How the engine uses your different events to calculate scores is partly defined by you and partly determined during training.
In the engine.json config file you set an array of event names. The first event in the array is the primary event, the one the engine will try to predict. The other events you specify are secondary events, which the engine is allowed to take into consideration when looking for correlations with the primary event in your data set. If no correlation is found for a given event, that event's data is not taken into account when predicting results.

Your array might look like this when predicting purchases: ["purchase", "initiated_payment", "view", "preview"]

If you use the special $set event to add metadata to your items, you can apply a bias or filter on those metadata properties at query time.
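For illustration, here is a minimal sketch of both steps using the PredictionIO Python SDK. The access key, server URLs, item ID, and the "category" property are all placeholders; the query's fields entry follows the UR's name/values/bias convention, where a bias above 1 boosts matching items and a negative bias acts as a hard filter.

    import predictionio

    # Placeholders: substitute your app's access key and server URLs.
    events = predictionio.EventClient(access_key="ACCESS_KEY",
                                      url="http://localhost:7070")

    # $set attaches metadata properties to an item.
    events.create_event(event="$set",
                        entity_type="item",
                        entity_id="item-1",
                        properties={"category": ["electronics"]})

    # At query time, bias the results toward (or filter to) that property.
    engine = predictionio.EngineClient(url="http://localhost:8000")
    recs = engine.send_query({
        "item": "item-1",
        "fields": [{"name": "category", "values": ["electronics"], "bias": 1.5}]
    })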

/magnus

From: Harsh Mathur <ha...@gmail.com>
Sent: Monday, November 28, 2016 3:46:46 PM
To: user@predictionio.incubator.apache.org
Subject: Tuning of Recommendation Engine
 
Hi,
I have successfully deployed the UR template.

Now I want to tune it a little. As of now I am sending 4 events: purchase, view, initiated_payment, and preview. Our products also have categories, which I am setting as item properties.
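(For reference, sending these events via the PredictionIO Python SDK might look like the sketch below; the access key, URL, and IDs are made up.)

    import predictionio

    client = predictionio.EventClient(access_key="ACCESS_KEY",
                                      url="http://localhost:7070")

    # One (user, event, item) triple per interaction; the names must match
    # the eventNames array in engine.json, primary event first.
    for name in ["purchase", "view", "initiated_payment", "preview"]:
        client.create_event(event=name,
                            entity_type="user",
                            entity_id="user-1",
                            target_entity_type="item",
                            target_entity_id="item-1")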

Now, say I query:
{
  "item": "{item_id}",
  "fields": [
    {
      "name": "view",
      "bias": 0.5
    },
    {
      "name": "preview",
      "bias": 5
    },
    {
      "name": "purchase",
      "bias": 20
    }
  ]
}

and the query:
{
  "item": "{item_id}"
}


For both queries I get the same set of recommendations; only the scores vary. The boosting isn't changing which items are recommended, just their scores. Is there any way in the UR to give more preference to some events? It would give us more room to experiment and make the recommendations more relevant to us.

Regards
Harsh Mathur
harshmathur.1990@gmail.com

“Perseverance is the hard work you do after you get tired of doing the hard work you already did.”


Re: Tuning of Recommendation Engine

Posted by Pat Ferrel <pa...@occamsmachete.com>.
No, we find the value of quantile LLR thresholds and use those thresholds to calculate MAP. Then we look at MAP * number-of-people-that-get-recs to see if there is a maximum. This is basically an analysis of precision vs. recall; MAP will often increase monotonically with higher thresholds until you get no recommendations at all. Hyper-parameter search via trial and error.
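To make the metric concrete, here is a small Python sketch, assuming per-user recommendation lists and held-out conversion items; the MAP@k definition is the standard one, and the final product is the MAP-times-coverage quantity described above.

    def avg_precision_at_k(recs, held_out, k=10):
        # Mean of precision-at-rank over the ranks that hit a held-out item.
        hits, total = 0, 0.0
        for rank, item in enumerate(recs[:k], start=1):
            if item in held_out:
                hits += 1
                total += hits / rank
        return total / min(len(held_out), k) if held_out else 0.0

    def map_times_coverage(recs_by_user, held_out_by_user, k=10):
        users = list(held_out_by_user)
        mean_ap = sum(avg_precision_at_k(recs_by_user.get(u, []),
                                         held_out_by_user[u], k)
                      for u in users) / len(users)
        # Fraction of users who get any recs at all; raising the LLR
        # threshold tends to raise MAP while this collapses.
        coverage = sum(1 for u in users if recs_by_user.get(u)) / len(users)
        return mean_ap * coverage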


On Dec 4, 2016, at 9:15 PM, Gustavo Frederico <gu...@thinkwrap.com> wrote:

Pat, is this tool that finds the optimal LLR thresholds using MAP@K? Did you model it as a regression problem? 

Thanks

Gustavo


On Thu, Dec 1, 2016 at 2:48 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
This is a very odd statement. How many tuning knobs do you have with MLlib’s ALS, 1 or 2? There are a large number of tuning knobs in the UR to fit different situations. What other recommender allows multiple events as input? The UR also has business rules in the form of filters and boosts on item properties. I think you may have missed a lot in the docs; check some of the most important tuning here: http://actionml.com/docs/ur_advanced_tuning and the config params for business rules here: http://actionml.com/docs/ur_config

But changes must be based on either A/B tests or cross-validation. Guessing at tuning is dangerous; intuition about how a big-data algorithm works takes a long time to develop, and the trade-offs may do your business harm.

We have a tool that finds optimal LLR thresholds based on predictive strength and sets the threshold per event pair. While you can set these by hand, the pattern we follow is called hyper-parameter search, which finds optimal tuning for you.
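In outline, that search could look like the sketch below; train and evaluate are stand-ins for a real retrain plus cross-validation run, not actual UR or ActionML tool APIs.

    def sweep_llr_thresholds(thresholds, train, evaluate):
        # Brute-force hyper-parameter search: retrain at each candidate
        # threshold and keep the best cross-validation score
        # (e.g. MAP@k * number-of-people-that-get-recs).
        best_t, best_score = None, float("-inf")
        for t in thresholds:
            score = evaluate(train(llr_threshold=t))
            if score > best_score:
                best_t, best_score = t, score
        return best_t, best_score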


On Dec 1, 2016, at 11:17 AM, Harsh Mathur <harshmathur.1990@gmail.com> wrote:

Hi Pat,
I really appreciate the product, but our team was discussing how little control we have here.
Say some recommendations get delivered to users and we are tracking conversions, so we know whether it's working or not. Now, if we see that conversions are low, as a developer I have very little to experiment with here. I don't mean any disrespect; I have gone through the code and put in effort to understand it too. The UR is still better than the explicit or implicit templates since it has filtering on properties; the only thing lacking, in my opinion, is the weightings.

I read your ppt:
Recommendations = PtP + PtV + ...
We were wondering if it could be
Recommendations = a * PtP + b * PtV + ...

Where a and b are constants for tuning. In my understanding PtP is a matrix, so scalar multiplication should be possible. Please correct me if I am wrong.
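Mathematically the weighted sum is indeed well defined, as the toy NumPy sketch below shows; the constants a and b are purely illustrative, since (per Pat's reply) the UR does not expose hand-picked event weights.

    import numpy as np

    # Toy item-item indicator matrices (1 = pair kept by the LLR test),
    # one for purchase cooccurrence (PtP), one for view cross-occurrence (PtV).
    PtP = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
    PtV = np.array([[0.0, 1.0],
                    [0.0, 0.0]])

    a, b = 1.0, 0.5          # hypothetical hand-tuned weights
    R = a * PtP + b * PtV    # scalar-times-matrix is perfectly legal
    print(R)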

I was also reading about the log-likelihood method, but I couldn't find a proper explanation. I would be happy if anyone here could explain it in more detail. Thanks in advance.

Here is what I understood.
For every item-item pair per expression (PtP, PtV), to calculate a score it finds 4 counts:
1. Number of users who posted both events for the pair,
2. Number of users who posted the event for the first item but not the second,
3. Number of users who posted the event for the second item but not the first,
4. Number of users who posted for neither.

Then a formula is applied taking the 4 counts as input and a score is returned.

For each item and event pair you store the top 20 items by score in Elasticsearch. I didn't understand why the 2nd, 3rd, and 4th counts are needed. Also, can anyone explain the correctness of the method? That is, why it works rather than how it works.
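For the formula itself: the UR inherits Mahout's log-likelihood ratio (Dunning's G^2 test) over the 2x2 table of those four counts; a Python sketch mirroring Mahout's LogLikelihood class follows. The counts beyond the first matter because they let the test separate genuine correlation from mere popularity: an item everyone touches co-occurs with everything, and only the full table shows that such co-occurrence is no better than chance.

    from math import log

    def x_log_x(x):
        return x * log(x) if x > 0 else 0.0

    def entropy(*counts):
        return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

    def llr(k11, k12, k21, k22):
        # k11: users with both events; k12, k21: one but not the other;
        # k22: neither. A high score means the co-occurrence is unlikely
        # to be chance.
        row = entropy(k11 + k12, k21 + k22)
        col = entropy(k11 + k21, k12 + k22)
        mat = entropy(k11, k12, k21, k22)
        return max(0.0, 2.0 * (row + col - mat))

    # A pair that co-occurs far beyond chance scores much higher than one
    # that co-occurs mostly because both items are popular:
    print(llr(100, 10, 10, 10000))   # high
    print(llr(10, 100, 100, 10000))  # much lower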

Regards
Harsh Mathur

Re: Tuning of Recommendation Engine

Posted by Pat Ferrel <pa...@occamsmachete.com>.
BTW h_a(AtA) is called cooccurrence and has been widely documented and experimented with. The primary extension we make is cross-occurrence, to add new events. Cross-validation tests of the predictive value of recommendations, and A/B tests, have been done many times. Mostly these are private, so the results are hard to share, but I am unaware of anyone who ran them and did not stick with the UR. One experiment put the UR up against a big-name SaaS recommender company and the UR beat them by a very large margin. The SaaS company had their own people tune their algorithm, while the UR was not tuned at all, just using defaults. The UR won this test by about 300%. I hasten to add that this seems a ridiculous margin, much more than most cases could hope for, and if the SaaS company hadn't set it up themselves I'd suspect some major error on their side.

The best method is to choose a candidate or two, then find a winner or validate your choice with A/B tests; don't decide on the ability to “tune” alone. The SaaS company may actually have de-tuned to get such abysmal results; they don't even answer questions about how they tune (or de-tune).

In any case it is free OSS; feel free to use anything you want, or modify the source to tune anything your intuition says might work.

