Posted to user@mahout.apache.org by Octavian Covalschi <oc...@gmail.com> on 2011/10/17 03:58:29 UTC

Removing an item from all users

Hi there.

We have the Taste WAR running and we are trying to get real-time behavior
by adding and removing user items on the fly. So far, adding and removing
items for particular users has been working fine for us, and recommendations
change on the fly, though it's still in beta...
The last piece is to remove an item from ALL users, which is necessary when
the item is removed from the system... So far I've got this:

public void removeItemFromAllUsers(long itemId) throws NoSuchItemException {
    GenericDataModel model = (GenericDataModel) delegate;
    // All preferences expressed for this item, one per user
    PreferenceArray prefArray = model.getPreferencesForItem(itemId);

    if (prefArray.length() > 0) {
        FastByIDMap<PreferenceArray> rawData = model.getRawUserData();

        for (Preference pr : prefArray) {
            long userId = pr.getUserID();
            PreferenceArray prefs = rawData.get(userId);
            int length = prefs.length(); // number of preferences this user has

            rawData.remove(userId);

            // A user whose only preference was this item stays removed
            if (length > 1) {
                PreferenceArray newPrefs = new GenericUserPreferenceArray(length - 1);
                // Copy every preference except the one for itemId
                for (int i = 0, j = 0; i < length; i++) {
                    if (prefs.getItemID(i) != itemId) {
                        newPrefs.set(j++, prefs.get(i));
                    }
                }
                rawData.put(userId, newPrefs);
            }
        }
        // Rebuild the model so the item-side data is regenerated
        delegate = new GenericDataModel(rawData);
    }
}

I'm still pretty new to Mahout and was wondering if there is a better way,
in particular one that makes use of getRawUserData(), which I tried but it
didn't help...

Thank you in advance.

Re: Removing an item from all users

Posted by Octavian Covalschi <oc...@gmail.com>.
That's a good point about keeping it for a while. However, in our case the
items that are removed are not relevant: they were added by mistake or are
inappropriate, so basically they don't bring any value. Also, handling
'ghost' items adds additional complexity...

We don't generate recommendations for each user in the background. The model
is loaded into memory when Taste starts up (along with Tomcat), and the data
for the model is loaded from the database. The actual recommendations are
retrieved on user requests, and each time a user acts on items (likes,
etc.) they may see different recommendations...


On Sun, Oct 16, 2011 at 9:25 PM, Ted Dunning <te...@gmail.com> wrote:

> On Sun, Oct 16, 2011 at 6:58 PM, Octavian Covalschi <
> octavian.covalschi@gmail.com> wrote:
>
> > ... We have the taste war running and we are trying to accomplish a
> > realtime functionality by removing/adding user items on the fly. So far
> > adding and removing items for particular users has been working fine for
> > us and recommendations are being changed on the fly.. though it's still
> > in beta...
> > The last piece is to remove an item from ALL users, it's necessary when
> > the item is removed from the system...
>
>
> Two thoughts on this.
>
> First, is it really necessary to remove the item at all other than from
> recommendation results?  After all, the item still defines some aspect of
> user similarity.  Why not keep it around for a while?
>
> Secondly, can you do the removal lazily as you start to generate
> recommendations for a user or in the background in a scan over all users?
>

Re: Removing an item from all users

Posted by Ted Dunning <te...@gmail.com>.
On Sun, Oct 16, 2011 at 6:58 PM, Octavian Covalschi <
octavian.covalschi@gmail.com> wrote:

> ... We have the taste war running and we are trying to accomplish a
> realtime
> functionality by removing/adding user items on the fly. So far adding and
> removing items for particular users has been working fine for us and
> recommendations are being changed on the fly.. though it's still in beta...
> The last piece is to remove an item from ALL users, it's necessary when the
> item is removed from the system...


Two thoughts on this.

First, is it really necessary to remove the item at all other than from
recommendation results?  After all, the item still defines some aspect of
user similarity.  Why not keep it around for a while?

Secondly, can you do the removal lazily as you start to generate
recommendations for a user or in the background in a scan over all users?
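
The lazy removal suggested here could look roughly like the sketch below. It uses plain Java collections rather than Mahout's DataModel, and the class and method names (LazyItemRemoval, setItems, removeItem, itemsFor) are hypothetical, chosen only for illustration. Removing an item just records its ID; a user's data is cleaned only when that user is next looked at.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Mahout: maps a user ID to the item IDs
// that user has preferences for.
class LazyItemRemoval {
    private final Map<Long, long[]> itemsByUser = new ConcurrentHashMap<>();
    private final Set<Long> removedItems = ConcurrentHashMap.newKeySet();

    void setItems(long userId, long... itemIds) {
        itemsByUser.put(userId, itemIds.clone());
    }

    // Constant-time: only records the removal, no user data is touched yet.
    void removeItem(long itemId) {
        removedItems.add(itemId);
    }

    // Purges removed items from this one user, e.g. just before recommending.
    long[] itemsFor(long userId) {
        long[] cleaned = itemsByUser.computeIfPresent(userId, (id, items) ->
                Arrays.stream(items)
                      .filter(item -> !removedItems.contains(item))
                      .toArray());
        return cleaned == null ? new long[0] : cleaned;
    }
}
```

A background scan would simply call itemsFor for every known user; once the scan has finished, the removed-item set can be emptied.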

Re: Removing an item from all users

Posted by Octavian Covalschi <oc...@gmail.com>.
Thanks. I'll try to make use of Rescorer. Now it's a matter of finding a
good example...



On Mon, Oct 17, 2011 at 8:26 AM, Sean Owen <sr...@gmail.com> wrote:

> Yes, remove the bad items and prefs from your backing store. That gets
> updated then next time you refresh(). Eventually, you want those gone,
> it seems, since they're not useful information.
>
> Among other things, the Rescorer is there to filter out items that are
> useful for the internal CF algorithm, but not a valid recommendation
> (e.g. out-of-stock items). But it could be slightly abused to filter
> out bad data on the fly, on the front end.
>
> You don't need to or want to attach additional info to those user-item
> associations. Presumably the bad-ness is a property of the item. So
> you just need a Rescorer that knows a list of bad item IDs, and
> filters them.

Re: Removing an item from all users

Posted by Sean Owen <sr...@gmail.com>.
Yes, remove the bad items and prefs from your backing store. That gets
updated the next time you refresh(). Eventually, you want those gone,
it seems, since they're not useful information.

Among other things, the Rescorer is there to filter out items that are
useful for the internal CF algorithm, but not a valid recommendation
(e.g. out-of-stock items). But it could be slightly abused to filter
out bad data on the fly, on the front end.

You don't need to or want to attach additional info to those user-item
associations. Presumably the bad-ness is a property of the item. So
you just need a Rescorer that knows a list of bad item IDs, and
filters them.
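
A minimal sketch of such a Rescorer follows. To keep it compilable on its own, it declares a local stand-in interface with the same two methods as Mahout's org.apache.mahout.cf.taste.recommender.IDRescorer; in a real project you would implement the Mahout interface instead. The class name BadItemRescorer and its markBad/clear methods are assumptions made for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-in mirroring the two methods of Mahout's IDRescorer,
// declared here only so the sketch compiles without Mahout on the classpath.
interface IDRescorer {
    double rescore(long id, double originalScore);
    boolean isFiltered(long id);
}

// Hypothetical class: filters a known set of bad item IDs.
class BadItemRescorer implements IDRescorer {
    private final Set<Long> badItemIds = ConcurrentHashMap.newKeySet();

    BadItemRescorer(Set<Long> initialBadItemIds) {
        badItemIds.addAll(initialBadItemIds);
    }

    // Call when an item is removed from the system; takes effect immediately.
    void markBad(long itemId) {
        badItemIds.add(itemId);
    }

    // Call once the model has been reloaded from the cleaned backing store.
    void clear() {
        badItemIds.clear();
    }

    @Override
    public double rescore(long id, double originalScore) {
        return originalScore; // scores are left unchanged
    }

    @Override
    public boolean isFiltered(long id) {
        return badItemIds.contains(id); // bad items never reach the output
    }
}
```

With the real interface, an instance would be passed as the third argument to recommender.recommend(userID, howMany, rescorer), so the bad items are dropped from results without touching the in-memory model.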

On Mon, Oct 17, 2011 at 1:47 PM, Octavian Covalschi
<oc...@gmail.com> wrote:
> I understand. Even though I don't have feed files, we're using MongoDB as
> the main data source, I can update database for the "old" or "out of stock"
> items...
>
> So even in the case of ghost items or those I want to exclude and use
> Rescorer, I still need to update the in memory model, right? As far I
> understand it's not enough to update the feed, unless It's being
> refreshed... Is there a way to store additional optional data along with
> userId <-> itemId? The only solution I see now is to extend
> GenericPreference and add additional properties like isOld, price (for other
> purposes), is this a correct approach?
>
> PS: Thanks guys for your time!

Re: Removing an item from all users

Posted by Octavian Covalschi <oc...@gmail.com>.
I understand. Even though I don't have feed files (we're using MongoDB as
the main data source), I can update the database for the "old" or "out of
stock" items...

So even in the case of ghost items, or those I want to exclude using a
Rescorer, I still need to update the in-memory model, right? As far as I
understand, it's not enough to update the feed unless it's being
refreshed... Is there a way to store additional optional data along with
userId <-> itemId? The only solution I see now is to extend
GenericPreference and add additional properties like isOld, price (for other
purposes). Is this a correct approach?

PS: Thanks guys for your time!

On Mon, Oct 17, 2011 at 2:35 AM, Sean Owen <sr...@gmail.com> wrote:

> In general this DataModel is not modifiable, which is why you're
> having to hack at it. The idea is to update the underlying data,
> ideally -- for example, the file containing the data, perhaps by using
> update (delta) files.
>
> Your approach works though. There is not a better way to do it, if
> this is what you're doing.
>
> As Ted mentioned, keep data on items that are real and valid, just out
> of stock or old or something. You can filter them using a Rescorer so
> that they don't end up in the output.
>
> You can do the same thing for ghost data, if you like. That at least
> means you have an immediate way of filtering and can wait for a data
> reload or update rather than do it every time on the fly.

Re: Removing an item from all users

Posted by Sean Owen <sr...@gmail.com>.
In general this DataModel is not modifiable, which is why you're
having to hack at it. The idea is to update the underlying data,
ideally -- for example, the file containing the data, perhaps by using
update (delta) files.

Your approach works though. There is not a better way to do it, if
this is what you're doing.

As Ted mentioned, keep data on items that are real and valid, just out
of stock or old or something. You can filter them using a Rescorer so
that they don't end up in the output.

You can do the same thing for ghost data, if you like. That at least
means you have an immediate way of filtering and can wait for a data
reload or update rather than do it every time on the fly.

Re: Removing an item from all users

Posted by Octavian Covalschi <oc...@gmail.com>.
Small correction: make use of *getRawItemData*

On Sun, Oct 16, 2011 at 8:58 PM, Octavian Covalschi <
octavian.covalschi@gmail.com> wrote:

> getRawUserData