Posted to user@spark.apache.org by Mina Aslani <as...@gmail.com> on 2017/02/24 02:19:47 UTC

Apache Spark MLIB

Hi,

I am about to start working on anomaly detection using Spark MLlib. Please
note that I have not used Spark before.

I would like to read some login data and, if a user logs in from a different
IP address that is not common for them, treat that login as an anomaly,
similar to what Apple and Google do.

My preferred programming language is Java.

Could you recommend books or workshops that would guide me on which ML
algorithm to use and how to implement it? I would also like to know about
Spark's supervised and unsupervised options and which algorithm you suggest.

I would really appreciate it if you could share your thoughts, experience,
or insight with me.

Best regards,
Mina

Re: Apache Spark MLIB

Posted by Jon Gregg <co...@gmail.com>.
Here's a high-level overview of Spark's ML Pipelines from around the time
they were released: https://www.youtube.com/watch?v=OednhGRp938.

But reading your description, you might be able to build a basic version of
this without ML.  Spark has broadcast variables
<http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables>
that would allow you to put flagged IP ranges into an array and make that
array available on every node.  Then you can use filters to detect users
who've logged in from a flagged IP range.
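
A minimal sketch of that idea in plain Java, so it can be run without a
cluster. The class names (FlaggedIpFilter, IpRange, Login), the record layout,
and the sample addresses are all hypothetical and only for illustration. The
predicate isFlagged is what you would pass to a Spark filter; the comments in
main show roughly how the array could be wrapped in a broadcast variable,
assuming a JavaSparkContext named jsc and a JavaRDD<Login> named loginsRdd
that you create yourself.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FlaggedIpFilter {

    /** Inclusive IPv4 range; Serializable so it could be shipped to executors. */
    public static final class IpRange implements Serializable {
        private static final long serialVersionUID = 1L;
        final long start;
        final long end;

        public IpRange(String startIp, String endIp) {
            this.start = ipToLong(startIp);
            this.end = ipToLong(endIp);
            if (start > end) {
                throw new IllegalArgumentException("range start is after range end");
            }
        }

        boolean contains(long ip) {
            return ip >= start && ip <= end;
        }
    }

    /** One login event (hypothetical record layout). */
    public static final class Login implements Serializable {
        private static final long serialVersionUID = 1L;
        final String user;
        final String ip;

        public Login(String user, String ip) {
            this.user = user;
            this.ip = ip;
        }
    }

    /** Converts a dotted-quad IPv4 string to its unsigned 32-bit value. */
    public static long ipToLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** True if the address falls inside any flagged range. */
    public static boolean isFlagged(String ip, IpRange[] flaggedRanges) {
        long value = ipToLong(ip);
        for (IpRange range : flaggedRanges) {
            if (range.contains(value)) {
                return true;
            }
        }
        return false;
    }

    /** Local stand-in for the Spark filter step: keeps users with a flagged login. */
    public static List<String> flaggedUsers(List<Login> logins, IpRange[] flaggedRanges) {
        List<String> users = new ArrayList<>();
        for (Login login : logins) {
            if (isFlagged(login.ip, flaggedRanges)) {
                users.add(login.user);
            }
        }
        return users;
    }

    public static void main(String[] args) {
        // Sample data uses documentation-only address blocks (made up for this sketch).
        IpRange[] ranges = { new IpRange("203.0.113.0", "203.0.113.255") };
        List<Login> logins = Arrays.asList(
                new Login("alice", "203.0.113.7"),
                new Login("bob", "192.0.2.10"));

        // In Spark (not run here; jsc and loginsRdd are assumed to exist):
        //   Broadcast<IpRange[]> flagged = jsc.broadcast(ranges);
        //   JavaRDD<Login> hits =
        //       loginsRdd.filter(l -> isFlagged(l.ip, flagged.value()));
        System.out.println(flaggedUsers(logins, ranges));
    }
}
```

A linear scan over the ranges is fine for a short list. If the flagged list
grows large, sorting the ranges and using a binary search would likely be a
better fit.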

Jon Gregg
