Posted to user@mahout.apache.org by Karl Wettin <ka...@gmail.com> on 2009/06/12 12:36:52 UTC
Tastify
Hi all,
I'm experimenting with Mahout by connecting it to Spotify
<http://spotify.com/>, a service that streams music on the net. It would
be really cool if you people could help me with a ground truth. The data
will of course be released to the public.
At first I tried to use playlists I scraped from the net as
recommendation profiles. I'm not sure whether my thesis was wrong (that
users always like everything they put in their own small,
non-collaborative playlists) or whether something else made that
strategy fail, so I have started from scratch with only real user
preferences. For now it's something like 10 users and 500 preferences,
so don't expect it to produce great results.
http://tastify.kodapan.se:8081/
It will register your account the first time you log in, and it will show
you in clear text what password you chose, so choose something silly.
You use plain text queries or Spotify URIs in the search form. Start
with spotify:user:karl.wettin:playlist:1LOXpeOdStzoavRodI4zXZ to
connect with an already existing neighborhood. But please also try to
add some ratings to tracks not available in that playlist, preferably
10 or more.
Finally hit "Our recommendations" in order to get some results.
I have a handful of invites to Spotify if you don't have an account.
Not needed to use the Tastify service though, only if you want to play
the music.
Beware of the GUI. Lots of bugs, please report them if you see them.
The service will go up and down now and then. Try to log in again if you
get an exception.
karl
Re: Tastify
Posted by Ted Dunning <te...@gmail.com>.
Hierarchical modeling techniques work well on structures like this if you
have good resolution of your meta-data. Resolving and disambiguating artist
and track names can be difficult unless you have total control over the
meta-data source.
The basic idea is that you model an artist as a distribution over "concept
space", which is just a fancy name for latent variables you don't plan to
understand. Then an album is sampled from the artist and is itself another
distribution, and finally a track is sampled from the album. This is similar
to the way that in LDA, documents and words are distributions over your
latent concept variables. Specific meanings are chosen at each point in a
document and the word you observe is chosen based on the concept at that
point.
Since you only observe which word appears in which document, you have to
reverse-engineer what the latent concepts might have been by getting a
compromise between the word and document distributions.
In your case, you have a simpler generative model, but similar techniques
should apply.
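The generative story above can be sketched roughly as follows, in Python
using only the standard library. The number of concepts, the Dirichlet
parameterization and the concentration values are all assumptions picked
for illustration; this is not Mahout code and not necessarily the exact
model meant here. It only shows the forward (sampling) direction, not the
inference that recovers the latent distributions from observed ratings.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_child(parent, concentration, rng):
    """Sample a child distribution centered on its parent.

    A higher concentration keeps the child closer to the parent.
    """
    # The small floor keeps every gamma shape parameter strictly positive.
    alpha = [concentration * p + 1e-3 for p in parent]
    return sample_dirichlet(alpha, rng)


rng = random.Random(42)
NUM_CONCEPTS = 4  # assumed size of the latent "concept space"

# Artist: a distribution over latent concepts.
artist = sample_dirichlet([1.0] * NUM_CONCEPTS, rng)
# Album: sampled from the artist, itself a distribution over concepts.
album = sample_child(artist, 50.0, rng)
# Track: sampled from the album.
track = sample_child(album, 200.0, rng)

for name, dist in (("artist", artist), ("album", album), ("track", track)):
    print(name, [round(p, 3) for p in dist])
```

The two concentration values control how far an album may drift from its
artist and a track from its album; in a real model they would be fitted
rather than fixed by hand.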
On Sat, Jun 13, 2009 at 8:53 AM, Karl Wettin <ka...@gmail.com> wrote:
>
> I hope that some semi-sophisticated Album, Track and ArtistSimilarity can
> be used to improve the results.
>
> Perhaps it's a good idea to have Playlist, Album and Artist implemented as
> Item too.
--
Ted Dunning, CTO
DeepDyve
Re: Tastify
Posted by Karl Wettin <ka...@gmail.com>.
On 12 Jun 2009 at 16:15, Ted Dunning wrote:
> On Fri, Jun 12, 2009 at 3:36 AM, Karl Wettin <ka...@gmail.com> wrote:
>
>> At first I tried to use playlists I scraped from the net as recommendation
>> profiles.
>
> Do you have raw play events, or do you have progress events as well?
>
> The single biggest improvement you can make with this kind of system is to
> quantify engagement somehow. Play starts are often a very poor surrogate
> for preference while more engaged events such as 30 second progress can be
> much better (or not, music consumption can be a bit strange).
No events, at least not for now.
I do however have a rather nice domain model to navigate:
>
> /---------\
> | |
> | +similarArtists
> | |
> | V*
> \------[Artist]--------\
> / |1 |
> [Genre]<----/ | |
> * | |*
> *| V
> [Item]<|- - -[Track]------[Album]
> ^* *
> |
> |
> |
> [Playlist]
>
(It's supposed to be a UML class diagram.)
I hope that some semi-sophisticated Album, Track and ArtistSimilarity
can be used to improve the results.
Perhaps it's a good idea to have Playlist, Album and Artist
implemented as Item too.
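One way such a metadata-based track similarity could look, as a rough
Python sketch. The Track fields, the similar_artists mapping and all of
the weights are hypothetical and only mirror the domain model above; this
is not the Mahout similarity API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    track_id: str
    album: str
    artist: str


def track_similarity(a, b, similar_artists,
                     same_album=0.8, same_artist=0.5, related_artist=0.3):
    """Score two tracks from metadata alone (all weights are assumptions)."""
    if a.track_id == b.track_id:
        return 1.0
    if a.artist == b.artist:
        return same_album if a.album == b.album else same_artist
    # Treat the similarArtists relation as symmetric.
    if (b.artist in similar_artists.get(a.artist, set())
            or a.artist in similar_artists.get(b.artist, set())):
        return related_artist
    return 0.0


# Made-up example data.
t1 = Track("t1", "album-a", "artist-x")
t2 = Track("t2", "album-a", "artist-x")
t3 = Track("t3", "album-b", "artist-x")
t4 = Track("t4", "album-c", "artist-y")
t5 = Track("t5", "album-d", "artist-z")
similar_artists = {"artist-x": {"artist-y"}}

print(track_similarity(t1, t2, similar_artists))  # same album
print(track_similarity(t1, t3, similar_artists))  # same artist
print(track_similarity(t1, t4, similar_artists))  # related artists
print(track_similarity(t1, t5, similar_artists))  # unrelated
```

A score like this could fill in where two tracks share too few ratings
for a purely collaborative similarity to say anything.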
karl
Re: Tastify
Posted by Ted Dunning <te...@gmail.com>.
Karl,
Do you have raw play events, or do you have progress events as well?
The single biggest improvement you can make with this kind of system is to
quantify engagement somehow. Play starts are often a very poor surrogate
for preference while more engaged events such as 30 second progress can be
much better (or not, music consumption can be a bit strange).
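As a rough illustration of quantifying engagement, the Python sketch
below turns play events into preference weights. The event layout
(user, track, seconds played) and both weights are assumptions made up
for the example; only the 30 second threshold comes from the text above.

```python
from collections import defaultdict

PROGRESS_THRESHOLD_SECONDS = 30  # the "30 second progress" mentioned above


def engagement_preferences(events, start_weight=0.1, progress_weight=1.0):
    """Accumulate a preference weight per (user, track) pair.

    A play that reaches the progress threshold counts fully; a bare play
    start counts only a little. Both weights are illustrative guesses.
    """
    prefs = defaultdict(float)
    for user, track, seconds_played in events:
        if seconds_played >= PROGRESS_THRESHOLD_SECONDS:
            prefs[(user, track)] += progress_weight
        else:
            prefs[(user, track)] += start_weight
    return dict(prefs)


# Made-up events: (user, track, seconds played)
events = [
    ("u1", "t1", 5),
    ("u1", "t1", 45),
    ("u1", "t2", 3),
    ("u2", "t1", 200),
]
prefs = engagement_preferences(events)
for key in sorted(prefs):
    print(key, round(prefs[key], 2))
```

Whether a skipped track should count slightly positive, zero, or even
negative is exactly the kind of thing that would need checking against
real data.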
On Fri, Jun 12, 2009 at 3:36 AM, Karl Wettin <ka...@gmail.com> wrote:
> At first I tried to use playlists I scraped from the net as recommendation
> profiles.