Posted to user@predictionio.apache.org by Dennis Honders <de...@gmail.com> on 2017/05/16 17:19:18 UTC

Data UR

Hi,

1. 
I have already experimented with the Similar Product template:
https://predictionio.incubator.apache.org/templates/similarproduct/quickstart/

For the UR, are the events sent to the eventserver about the same, except that they can take more properties? In my case there are three kinds of events: set users, set items and set buys.

2. 
I have coordinates for the users. Is this supported as a property?

Note: in my case I would like to make predictions by user id and by an array of item ids, which is supported, and also for products that have never been bought (cold start). I have item properties like category id, manufacturer id, label and price range.

Thanks in advance

Re: Data UR

Posted by Pat Ferrel <pa...@occamsmachete.com>.
The docs should be fairly clear. If not, please suggest or PR any changes you find: http://actionml.com/docs/ur_input#property-event

Categorical properties must be arrays of strings, even if there is only one string. They must all be encoded like “category” below.

Yes, timestamps can be applied to any event, and they don’t play a role in item-set recs.
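
The encoding rule above (every categorical property is an array of strings, even when there is only one value) can be sketched in Python as follows. The helper name build_item_set_event is hypothetical and not part of any SDK; the event fields are the ones shown in this thread.

```python
import json


def build_item_set_event(item_id, properties):
    """Build a $set event for an item (hypothetical helper, not an SDK call).

    Scalar property values are wrapped in one-element lists so that every
    categorical property ends up as an array of strings.
    """
    normalized = {}
    for name, value in properties.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        normalized[name] = [str(v) for v in values]
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": str(item_id),
        "properties": normalized,
    }


# Properties taken from the sample item in this thread.
event = build_item_set_event("31", {
    "category": ["5", "8"],
    "manufacturer": "55",
    "label": "test-item",
    "price": "$1-$5",
})
print(json.dumps(event, indent=2))
```

With this, "manufacturer": "55" is sent as "manufacturer": ["55"], matching the way "category" is encoded.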


On May 17, 2017, at 8:34 AM, Dennis Honders <de...@gmail.com> wrote:

So it will be a $set event?
{
  "event" : "$set",
  "entityType" : "item",
  "entityId" : "31",
  "properties" : {
    "category": ["5", "8"],
    "manufacturer": "55",
    "label": "test-item",
    "price": "$1-$5"
  }
}

And can the event date (eventTime) be applied to the cart-transaction event?



2017-05-17 17:13 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
With multiple event types there appears to be a serious bug that kills training. I don’t advise RC1 at present. If you don’t get the error, it should work correctly but if you do, you’ll have to wait for a fix.

Option 2 is correct. Properties are applied to the entityId, which would be the cart. Properties for items are only set with $set events and are only used in filters and boosts. $set allows you to set them once; if we read them from the usage events, they would be set with every event.
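
As a rough sketch of Option 2, the snippet below turns one order into property-free usage events and keeps the item properties in separate $set events. The function names are hypothetical, the order id stands in for the user id as described in this thread, and the property values are assumed to be already encoded as arrays of strings.

```python
def order_to_usage_events(order_id, item_ids,
                          event_name="cart-transaction", event_time=None):
    """One usage event per item; the order (cart) id is used as the user id."""
    events = []
    for item_id in item_ids:
        event = {
            "event": event_name,
            "entityType": "user",
            "entityId": str(order_id),
            "targetEntityType": "item",
            "targetEntityId": str(item_id),
        }
        if event_time is not None:
            event["eventTime"] = event_time
        events.append(event)
    return events


def item_to_set_event(item_id, properties):
    """Item properties go into a $set event, sent once per item."""
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": str(item_id),
        "properties": properties,
    }


# Sample data from the thread: order 17 containing items 31, 32 and 33.
usage_events = order_to_usage_events(
    17, [31, 32, 33], event_time="2015-10-05T21:02:49.228Z")
set_events = [
    item_to_set_event(31, {"category": ["5", "8"], "manufacturer": ["55"]}),
    item_to_set_event(32, {"category": ["15", "18"], "manufacturer": ["66"]}),
    item_to_set_event(33, {"category": ["25", "28"], "manufacturer": ["77"]}),
]
print(len(usage_events), len(set_events))
```

Keeping the two kinds of events apart means an item's properties are written once, no matter how many carts the item appears in.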


On May 17, 2017, at 8:07 AM, Dennis Honders <dennishonders@gmail.com> wrote:

I have the UR 0.6.0 (develop) installed. 
I only have one order that has more than 50 items, so I can easily exclude that one. 

Should the training data look like option 1 or option 2?

Sample data:
order-id: 17
items: 31, 32, 33
 
Option 1

[
    {
        "event": "cart-transaction",
        "entityId": "17",
        "entityType": "user",
        "targetEntityId": "31",
        "targetEntityType": "item",
        "properties": {
            "category": ["5", "8"],
            "manufacturer": "55",
            "label": "test-item",
            "price": "$1-$5"
        },
        "eventTime": "2015-10-05T21:02:49.228Z"
    },{
        "event": "cart-transaction",
        "entityId": "17",
        "entityType": "user",
        "targetEntityId": "32",
        "targetEntityType": "item",
        "properties": {
            "category": ["15", "18"],
            "manufacturer": "66",
            "label": "test-item",
            "price": "$1-$5"
        },
        "eventTime": "2015-10-05T21:02:49.228Z"
    },{
        "event": "cart-transaction",
        "entityId": "17",
        "entityType": "user",
        "targetEntityId": "33",
        "targetEntityType": "item",
        "properties": {
            "category": ["25", "28"],
            "manufacturer": "77",
            "label": "test-item",
            "price": "$5-$15"
        },
        "eventTime": "2015-10-05T21:02:49.228Z"
    }
]

Or

Option 2

[
    {
        "event": "cart-transaction",
        "entityId": "17",
        "entityType": "user",
        "targetEntityId": "31",
        "targetEntityType": "item"
    },
...
]

with another event

{
  "event" : "item",
  "entityType" : "item",
  "entityId" : "31",
  "properties" : {
    "category": ["5", "8"],
    "manufacturer": "55",
    "label": "test-item",
    "price": "$1-$5"
  }
}

2017-05-17 2:24 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
Queries with item-sets are only available in UR 0.6.0, which is RC1 in the develop branch now, so the new docs are not live. This page describes the format of all usage-type events, which will not change: http://actionml.com/docs/ur_input. Think of entityType and targetEntityType as boilerplate that is always implied by the event name; just leave them as “user” and “item”. The event name is not important except as a human-readable id; it is used to group like events.

PIO defines the input event formats and does not allow arrays of ids but does allow arrays of events. I don’t think this is in the SDKs yet but using REST you can send a JSON array of no more than 50 events:

[
    {
        "event": "add-to-cart",
        "entityId": "cart-id",
        "entityType": "user",
        "targetEntityId": "product-id1",
        "targetEntityType": "item"
    },{
        "event": "add-to-cart",
        "entityId": "cart-id",
        "entityType": "user",
        "targetEntityId": "product-id2",
        "targetEntityType": "item"
    }
]

But the typical way to do this would be either as a “purchased-together” event when the cart is purchased, or as an add-to-cart event for each item, one at a time, whichever is easier.
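
A minimal sketch of sending an array of events over REST while respecting the limit of 50 events per request. The endpoint path /batch/events.json, the port 7070 and the access key value are assumptions about a default PredictionIO event server setup and should be checked against your installation; the snippet only builds the requests and does not send them.

```python
import json
import urllib.request

MAX_BATCH = 50  # limit per request, as stated in the thread


def chunk_events(events, size=MAX_BATCH):
    """Split a list of events into consecutive batches of at most `size`."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def build_batch_request(batch, base_url, access_key):
    """Prepare (but do not send) one POST request carrying a JSON array."""
    # Assumed endpoint path; verify against your event server version.
    url = f"{base_url}/batch/events.json?accessKey={access_key}"
    return urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# 120 illustrative add-to-cart events for one cart.
events = [
    {
        "event": "add-to-cart",
        "entityType": "user",
        "entityId": "cart-id",
        "targetEntityType": "item",
        "targetEntityId": f"product-id{n}",
    }
    for n in range(1, 121)
]
batches = chunk_events(events)
pending = [
    build_batch_request(b, "http://localhost:7070", "YOUR_ACCESS_KEY")
    for b in batches
]
print([len(b) for b in batches])  # [50, 50, 20]
# To actually send one request: urllib.request.urlopen(pending[0])
```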




On May 16, 2017, at 2:22 PM, Dennis Honders <dennishonders@gmail.com> wrote:

Okay, sounds a bit clearer. 
When I look at the docs (http://actionml.com/docs/ur_input), it's still not that clear how the data is sent to the eventserver for training.

"Each cart would have a “user-id” or unique identifier per cart"

In my case, is this the transaction id (the cart, i.e. the user-id in the JSON), with the item ids that belong to the transaction as a property? Or can the targetEntityType take an array?

2017-05-16 23:01 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
If you want “things that belong in this same shopping cart” you need to train a model on shopping carts. Each cart would have a “user-id” or unique identifier per cart (not really a user-id, but that is how it would be input); then you would request item-set recommendations for the current contents of the shopping cart.

If you make the same query after training on user events like “purchase” you will get similar items. This may give you items that look a lot like what you have in the cart already and not be what you want. You want things that go with the cart contents not things like the cart contents.

In this sense the template you were using before is incorrect; you should have used “complementary purchases”. But no worry, the UR does both (and others); you just need to input different event encodings to get the 2 different results.
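
For illustration, an item-set query for the current cart contents might be built like this. The field names "itemSet" and "num" are assumptions based on the UR 0.6.0 query format and should be verified against the UR query docs once they are live; the helper name is hypothetical.

```python
import json


def build_item_set_query(cart_item_ids, num=4):
    """Build a query body asking for items that go with the given cart.

    "itemSet" and "num" are assumed field names, not confirmed in this thread.
    """
    return {
        "itemSet": [str(item_id) for item_id in cart_item_ids],
        "num": num,
    }


# Current cart contents, using the sample item ids from this thread.
query = build_item_set_query([31, 32, 33])
print(json.dumps(query))
```

Whether the results are complementary items or similar items depends on the events the model was trained on, as described above, not on the query itself.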



On May 16, 2017, at 12:50 PM, Dennis Honders <dennishonders@gmail.com> wrote:

My intent was not to mix the user id and item ids, but maybe to show one list of recommendations by the user id and another list by the item ids.
The current use case is shopping cart recommendations, so I have both a user id and a list of item ids in the shopping cart.

2017-05-16 19:42 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
Answers below:


On May 16, 2017, at 10:19 AM, Dennis Honders <dennishonders@gmail.com> wrote:

Hi,

1. 
I have already experimented with the Similar Product template:
https://predictionio.incubator.apache.org/templates/similarproduct/quickstart/

For the UR, are the events sent to the eventserver about the same, except that they can take more properties? In my case there are three kinds of events: set users, set items and set buys.

The UR only needs the buys and determines users and items from the buys. You’d do better if you have other events like product detail views, or category of item bought, etc.

2. 
I have coordinates for the users. Is this supported as a property?

Yes to location, but lat/lon is problematic. An area location like a postal code, or something like country+province+city, works much better. A location needs to be able to contain more than one person, so lat/lon is theoretically not applicable since it is too fine-grained.

Note: in my case I would like to make predictions by user id and by an array of item ids, which is supported, and also for products that have never been bought (cold start). I have item properties like category id, manufacturer id, label and price range.

All are supported, but I’ll warn that you should test these results. Mixing user-id and item-sets has no theoretical basis for working, and without correct boosting of one over the other they may interfere and create worse results. Also, item-sets can work to produce either “similar items” or “complementary items”, as in things you might want in the same shopping cart. These require different model building.

How are you generating the array of items? What is your goal for this? If you want items similar to the one being viewed (on the current page, for instance), use an item-based query; it will return items similar to the one viewed and can mix with user-based items.
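
A small sketch of the item-based query mentioned above, optionally mixed with a user id. The field names "item", "user" and "num" are assumptions about the UR query format, and the helper name is hypothetical.

```python
def build_item_query(item_id, user_id=None, num=4):
    """Build an item-based query, optionally mixed with user-based recs.

    "item", "user" and "num" are assumed field names; check the UR query docs.
    """
    query = {"item": str(item_id), "num": num}
    if user_id is not None:
        query["user"] = str(user_id)
    return query


# Similar items for the product currently being viewed.
item_only = build_item_query(31)
# The same query mixed with recommendations for a (hypothetical) user id.
mixed = build_item_query(31, user_id="u-1")
print(item_only, mixed)
```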

In general everything you mention is supported, but my gut feel is that it may be overly complicated, so I’d advise A/B testing a stripped-down simple query against this query to see if it really does produce better conversions. Let your data be your guide; intuition must be tested. Adding rules is often needed and is supported, but may also reduce conversion lift in unexpected ways.
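
One common way to compare conversions between the simple query (A) and the more complicated query (B) is a two-proportion z-test. This is a generic statistical sketch, not something the UR provides, and the visitor and conversion counts below are made-up illustrative numbers.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference in conversion rate between B and A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("conversion rates of exactly 0 or 1 cannot be compared")
    return (p_b - p_a) / se


# Made-up counts: A = simple query, B = query with extra rules and boosts.
z = two_proportion_z(conv_a=120, n_a=4000, conv_b=150, n_b=4000)
# |z| above roughly 1.96 corresponds to significance at the 5% level
# for a two-sided test.
print(f"z = {z:.2f}")
```

A positive z means B converted better in the sample; only a sufficiently large |z| suggests the difference is not just noise.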

Thanks in advance







Re: Data UR

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Queries with item-sets are only in UR 0.6.0, which is RC1 in the develop branch now, so the new docs are not live. But this page describes the format of all usage-type events, which will not change: http://actionml.com/docs/ur_input. Think of entityType and targetEntityType as boilerplate always implied by the event name; just leave them as “user” and “item”. The event name is not important except as a human-readable id; it is used to group like events.

PIO defines the input event formats and does not allow arrays of ids but does allow arrays of events. I don’t think this is in the SDKs yet but using REST you can send a JSON array of no more than 50 events:

[
    {
        "event": "add-to-cart",
        "entityId": "cart-id",
        "entityType": "user",
        "targetEntityId": "product-id1",
        "targetEntityType": "item"
    },{
        "event": "add-to-cart",
        "entityId": "cart-id",
        "entityType": "user",
        "targetEntityId": "product-id2",
        "targetEntityType": "item"
    }
]

But the typical way to do this would be either as a “purchased-together” event when the cart is purchased, or with each add-to-cart, one item at a time, whichever is easier.
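
A rough sketch of how the batch input above could be sent over REST, in case it helps. It builds one event per item in a cart and splits the list into requests of at most 50 events. The function names are made up for illustration, and the base URL (http://localhost:7070), the /batch/events.json route and the accessKey query parameter are assumptions to check against your own PIO install and version.

```python
import json
import urllib.request

MAX_BATCH = 50  # per-request limit mentioned above


def cart_events(cart_id, item_ids, event="add-to-cart"):
    """Build one usage event per item; the cart id stands in for the user id."""
    return [
        {
            "event": event,
            "entityType": "user",
            "entityId": str(cart_id),
            "targetEntityType": "item",
            "targetEntityId": str(item_id),
        }
        for item_id in item_ids
    ]


def chunk(events, size=MAX_BATCH):
    """Split a list of events into lists of at most `size` events."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def send_batches(events, access_key, base_url="http://localhost:7070"):
    """POST each chunk as a JSON array (URL and route are assumptions)."""
    statuses = []
    for batch in chunk(events):
        req = urllib.request.Request(
            f"{base_url}/batch/events.json?accessKey={access_key}",
            data=json.dumps(batch).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            statuses.append(resp.status)
    return statuses
```

Calling send_batches(cart_events("cart-42", ["31", "55"]), "YOUR_ACCESS_KEY") would send a single request with two events; nothing is sent until send_batches is called.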




Re: Data UR

Posted by Dennis Honders <de...@gmail.com>.
Okay, sounds a bit clearer.
When I look at the docs: http://actionml.com/docs/ur_input, it's still not
that clear how the data is sent to the eventserver for training.

"Each cart would have a “user-id” or unique identifier per cart"

In my case, this is the transaction id (cart, user-id in the json) with the
item ids that belong to the transaction as property? Or can the
TargetEntityType take an array?

Re: Data UR

Posted by Pat Ferrel <pa...@occamsmachete.com>.
If you want “things that belong in this same shopping cart” you need to train a model on shopping carts. Each cart would have a “user-id” or unique identifier per cart (not really a user-id, but that is how it would be input). Then you would request item-set recommendations for the current contents of the shopping cart.

If you make the same query after training on user events like “purchase” you will get similar items. This may give you items that look a lot like what you have in the cart already and not be what you want. You want things that go with the cart contents, not things like the cart contents.

In this sense the template you were using before is incorrect; you should have used “complementary purchases”. But no worry, the UR does both (and others); you just need to input different event encodings to get the 2 different results.
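
To make the two kinds of request concrete, here is a minimal sketch of the query bodies for a per-user list and a per-cart (item-set) list. The helper names are hypothetical, and the field names "user", "itemSet" and "num" are my reading of the UR query format for 0.6.0, so treat them as assumptions and check them against the query docs for the version you run.

```python
import json


def user_query(user_id, num=10):
    """Query body for recommendations based on one user's history."""
    return {"user": str(user_id), "num": num}


def item_set_query(item_ids, num=10):
    """Query body for recommendations based on the current cart contents."""
    return {"itemSet": [str(i) for i in item_ids], "num": num}


if __name__ == "__main__":
    # Two separate lists: one by user id, one by the items in the cart.
    print(json.dumps(user_query("u-17")))
    print(json.dumps(item_set_query(["31", "55", "8"])))
```

Which kind of result the item-set query returns (similar or complementary) depends on what the model was trained on, as described above, not on the query itself.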


Re: Data UR

Posted by Dennis Honders <de...@gmail.com>.
My intent was not to mix the user id and item ids but maybe show a list of
recommendations by the user id and another list by the item ids.
The current use case is shopping cart recommendations. So I have both a
user id and a list of item ids in the shopping cart.

Re: Data UR

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Answers below:


On May 16, 2017, at 10:19 AM, Dennis Honders <de...@gmail.com> wrote:

> Hi,
>
> 1.
> I already used similar product template for experimenting.
> https://predictionio.incubator.apache.org/templates/similarproduct/quickstart/
>
> For UR, are the data queries for the eventserver about the same, but can take more properties? In my case three events. Set users, set items and set buys.

The UR only needs the buys and determines users and items from the buys. You’d do better if you have other events like product detail views, or category of item bought, etc.

> 2.
> I have coordinates for the users. Is this supported as property?

Yes to location, but lat/lon is problematic. An area location like postal code, or something like country+province+city, works much better. A location needs to be able to contain more than one person, so lat/lon is theoretically not applicable since it is too fine-grained.
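
One possible way to turn a user's address into a coarse area id, as a sketch only. The function names, the "location" event name and the idea of sending the area as the target of a usage-type event are assumptions for illustration, not something the docs prescribe; the point is just that many users should share the same id.

```python
def area_token(country, province, city):
    """Join coarse location parts into one id shared by everyone in that area."""
    parts = [p.strip().lower().replace(" ", "-") for p in (country, province, city)]
    return "+".join(parts)


def location_event(user_id, country, province, city):
    """Wrap the area id in the usual event shape (event name is hypothetical)."""
    return {
        "event": "location",
        "entityType": "user",
        "entityId": str(user_id),
        "targetEntityType": "item",
        "targetEntityId": area_token(country, province, city),
    }
```

A postal code prefix would work the same way, as long as the resulting id is coarse enough to cover more than one person.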

> Note: in my case I like to make predictions by user id and by an array of item ids which is supported, also for products that are never bought for cold start. I have item properties like category id, manufacturer id, label and price range.

All are supported, but I’ll warn that you should test these results. Mixing user-id and item-sets has no theoretical basis for working, and without correct boosting of one over the other they may interfere and create worse results. Also, item-sets can work to produce either “similar items” or “complementary items”, as in things you might want in the same shopping cart. These require different model building.

How are you generating the array of items? What is your goal for this? If you want items similar to the one being viewed (on the current page, for instance), use an item-based query; it will return items similar to the one viewed and can mix with user-based items.

In general everything you mention is supported, but my gut feel is that it may be overly complicated, so I’d advise A/B testing a stripped-down simple query against this query to see if it really does produce better conversions. Let your data be your guide; intuition must be tested. Adding rules is often needed and is supported, but may also reduce conversion lift in unexpected ways.
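
For the A/B comparison, a minimal sketch of how the two variants could be compared once you have conversion counts. It uses a standard two-proportion z-test, which is my choice of method and not something specific to the UR; the function name and the example counts are made up.

```python
import math


def conversion_lift(conv_a, n_a, conv_b, n_b):
    """Return (relative lift of B over A, z statistic, two-sided p-value).

    A is the simple query, B is the more complex one. Assumes both variants
    have traffic and that the pooled conversion rate is strictly between 0 and 1.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return (p_b - p_a) / p_a, z, p_value


if __name__ == "__main__":
    lift, z, p = conversion_lift(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
    print(f"lift={lift:.1%} z={z:.2f} p={p:.3f}")
```

A negative lift here would mean the extra rules or mixed query are hurting conversions relative to the simple query.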

> Thanks in advance