
Posted to user@predictionio.apache.org by George Yarish <gy...@griddynamics.com> on 2018/06/20 15:34:10 UTC

UR trending ranking as separate process

Hi!

I'm not sure this is the correct place to ask, since my question concerns the
UR specifically rather than pio itself.

Anyway, we are using the UR template for PredictionIO and we are about to use
trending ranking to sort UR output. If I understand correctly, the ranking is
created during training and stored in ES. Our training takes ~3 hours and we
launch it daily from a scheduler, but for trending rankings we want to get
up-to-date information every 30 minutes.

That means we want to separate training (score calculation) from ranking
calculation and launch them on different schedules.

Is there an easy way to achieve this? Does the UR support something like this?

Thanks,
George

Re: UR trending ranking as separate process

Posted by Sami Serbey <sa...@designer-24.com>.

Hi George,

I didn't quite follow your question, and I think I am missing something. So you're using the Universal Recommender and getting output sorted by trending items? Is that really a feature of this template? Could you tell me how you configure the template to get such output? I really hope you can answer that; I am also working with the UR template.

Regards,
Sami Serbey


RE: UR trending ranking as separate process

Posted by "WILLIAMS, PAUL H" <pw...@att.com>.

Unsubscribe from list please.


Re: UR trending ranking as separate process

Posted by Pat Ferrel <pa...@occamsmachete.com>.

Yes, we support “popular”, “trending”, and “hot” as methods for ranking items. UR query results are backfilled with these items if there are not enough results. So if a user has little history and only gets 5 out of 10 results based on that history, we will automatically return the other 5 from the “popular” results. This is the default if there is no specific config for it.

If you query with no user or item, we will return only from “popular” or whatever type of ranking you have set up.

To change which type of ranking you want, you can specify the period to use in calculating the ranking and which method to use: “popular”, “trending”, or “hot”. These roughly correspond to number of conversions, speed of conversions, and acceleration in conversions, if that helps.

Docs here: http://actionml.com/docs/ur_config (search for “rankings”).
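
To make the count / speed / acceleration distinction concrete, here is a rough Python sketch that derives the three scores from equal-length buckets of conversion counts. It only illustrates the idea described above; it is not the UR's actual implementation, and the function names, item ids, and bucket layout are made up for the example.

```python
from typing import Sequence


def popular(buckets: Sequence[int]) -> int:
    """Number of conversions: the total over all buckets."""
    return sum(buckets)


def trending(buckets: Sequence[int]) -> int:
    """Speed of conversions: change between the two most recent buckets."""
    if len(buckets) < 2:
        raise ValueError("trending needs at least 2 buckets")
    return buckets[-1] - buckets[-2]


def hot(buckets: Sequence[int]) -> int:
    """Acceleration: change in speed across the three most recent buckets."""
    if len(buckets) < 3:
        raise ValueError("hot needs at least 3 buckets")
    return (buckets[-1] - buckets[-2]) - (buckets[-2] - buckets[-3])


# Hypothetical daily conversion counts per item, oldest bucket first.
daily = {"item-a": [10, 20, 40], "item-b": [50, 45, 42]}

# Rank items by acceleration, highest first.
by_hot = sorted(daily, key=lambda item: hot(daily[item]), reverse=True)
```

In this toy data "item-b" has more conversions overall, but "item-a" is both speeding up and accelerating, so it would rank first under “trending” or “hot”.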



Re: UR trending ranking as separate process

Posted by George Yarish <gy...@griddynamics.com>.

Matthew, Pat

Thanks for the answers and concerns. Yes, we want to calculate trending every
30 minutes for the last X hours, where X might even be a few days. So the
realtime analogy is correct.




Re: UR trending ranking as separate process

Posted by Pat Ferrel <pa...@occamsmachete.com>.

No, the trending algorithm is meant to look at something like trends over 2
days. This is because it looks at 2 buckets of conversion frequencies, and
if you cut them smaller than a day you will have so much bias due to daily
variations that the trends will be invalid. In other words, the ups and
downs over a day need to be made irrelevant, and taking day-long buckets is
the simplest way to do this. Likewise for “hot”, which needs 3 buckets and
so takes 3 days' worth of data.

Maybe what you need is to just count conversions for 30 minutes as a
realtime thing. For every item, keep the conversions for the last 30 minutes
and sort the items periodically by count. This is a Kappa-style algorithm
doing online learning, which is not really supported by PredictionIO. You
will have to experiment with the length of time, since too short a period
will be very noisy, popping back and forth between items semi-randomly.



Re: UR trending ranking as separate process

Posted by KRISH MEHTA <kr...@gmail.com>.

Hey George,
If possible, can you let me know the size of the data and the system configuration for which training takes around 3 hours? I am working on a recommendation system, but I want something that can produce recommendations in near real time (within around 1-5 minutes).

Thanks, 
Krish


Re: UR trending ranking as separate process

Posted by Matthew Peychich <mp...@me.com>.

If I am understanding the question correctly, you want to update backfill data more frequently than a full model train (the CCO calculations). To do this you can swap out your engine.json file and specify the engine to only calculate the backfill data. This will update the existing model rather than replace it. Details can be found in ActionML’s Advanced Tuning documentation (http://actionml.com/docs/ur_advanced_tuning) under “Update the Popularity Model Only”.

— Matthew Peychich
