Posted to user@spark.apache.org by Robert Kaye <ro...@metabrainz.org> on 2019/02/19 12:26:16 UTC

Looking for an apache spark mentor

Hello!

I’m Robert Kaye from the MetaBrainz Foundation. We’re the people behind MusicBrainz ( https://musicbrainz.org ) and, more recently, ListenBrainz ( https://listenbrainz.org ). ListenBrainz is aiming to re-create what last.fm used to be. We’ve already got 200M listens (AKA scrobbles) from our users (which is not a lot, really). We’ve set up an Apache Spark cluster and are starting to build user listening statistics with it.

While our setup is working, we can see that we’re not going to scale up well given our current approach. We’ve been reading the docs and asking for help on the IRC channel, but we continue to miss important bits about how we should be doing things. Best practices around Spark seem to be hard to come by. :(

MetaBrainz is all open source and open data: any of the data we use is available for anyone to download. We’re a non-profit working hard towards creating open source music recommendation engines. We’re hoping that someone could take us under their wing, turn up in our IRC channel, and help us find the right path towards using Spark much more effectively than we have so far.

Is anyone on this list interested in helping out? Perhaps you know someone who might?

Thanks!

--

--ruaok        

Robert Kaye     --     rob@metabrainz.org     --    http://metabrainz.org


Re: Looking for an apache spark mentor

Posted by Robert Kaye <ro...@metabrainz.org>.

> On Feb 19, 2019, at 2:26 PM, Shyam P <sh...@gmail.com> wrote:
> 
> What IRC channel should we join?

I should’ve included that info in the first place, heh. Sorry:

#metabrainz on freenode, please.

I am ruaok, but pristine and iliekcomputers are also very much interested in learning more about Spark.

Thanks!

--

--ruaok        

Robert Kaye     --     rob@metabrainz.org     --    http://metabrainz.org


Re: Looking for an apache spark mentor

Posted by Shyam P <sh...@gmail.com>.
What IRC channel should we join?