Posted to dev@spark.apache.org by "Ulanov, Alexander" <al...@hpe.com> on 2016/12/14 00:57:52 UTC

Belief propagation algorithm is open sourced

Dear Spark developers and users,


HPE has open sourced an implementation for Apache Spark of belief propagation (BP), a popular message-passing algorithm for performing inference in probabilistic graphical models. BP provides exact inference for graphical models without loops. For graphical models with loops the inference is approximate, but in practice it has been shown to work well. The implementation is generic and operates on the factor graph representation of graphical models. It handles factors of any order and variable domains of any size. It is implemented with Apache Spark GraphX and can therefore scale to large models. It also supports computation in log scale for numerical stability. Large-scale applications of BP include fraud detection in banking transactions and malicious site detection in computer networks.
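
To make the log-scale point concrete, here is a minimal sketch of sum-product BP on a toy factor graph with two binary variables. It does not use the sandpiper API; the model, the numbers, and all function names are assumptions made up for illustration. Because this graph has no loops, the BP marginal should match the brute-force one.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of log-scale values.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

# Hypothetical toy model: binary variables A and B, one unary factor each,
# and one pairwise factor f(A, B). All values are stored in log scale.
log_prior_a = [math.log(0.6), math.log(0.4)]
log_prior_b = [math.log(0.5), math.log(0.5)]
log_pair = [[math.log(0.9), math.log(0.1)],
            [math.log(0.2), math.log(0.8)]]  # indexed as [a][b]

def normalize_log(log_belief):
    # Convert an unnormalized log-scale belief into probabilities.
    z = logsumexp(log_belief)
    return [math.exp(v - z) for v in log_belief]

def bp_marginal_b():
    # Variable A -> pairwise factor: the only other message into A is its prior.
    msg_a_to_f = log_prior_a
    # Pairwise factor -> B: sum over A of factor times message, done in log scale.
    msg_f_to_b = [
        logsumexp([log_pair[a][b] + msg_a_to_f[a] for a in range(2)])
        for b in range(2)
    ]
    # Belief at B: product of all incoming messages (a sum in log scale).
    return normalize_log([log_prior_b[b] + msg_f_to_b[b] for b in range(2)])

def brute_force_marginal_b():
    # Reference answer: enumerate the full joint and sum out A.
    joint = [[math.exp(log_prior_a[a] + log_prior_b[b] + log_pair[a][b])
              for b in range(2)] for a in range(2)]
    z = sum(sum(row) for row in joint)
    return [sum(joint[a][b] for a in range(2)) / z for b in range(2)]

if __name__ == "__main__":
    print(bp_marginal_b())
    print(brute_force_marginal_b())
```

Products of many small probabilities underflow quickly in linear scale, which is why the messages above are added as logs and only exponentiated at the final normalization.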


Source code: https://github.com/HewlettPackard/sandpiper


Best regards, Alexander

Re: Belief propagation algorithm is open sourced

Posted by "Ulanov, Alexander" <al...@hpe.com>.

Hi Bertrand,


We only do inference. We do not do structure or parameter estimation (or learning); for an MRF, that would mean estimating the factors and the structure of the graphical model. The parameters can be estimated using maximum likelihood if data is available for all the nodes, or by EM if there are hidden nodes. We do not implement MLE or EM.


Assuming the model parameters are already available, we can do inference for both Bayesian and Markov models.
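
As an illustration of what "parameters already available" can mean for a Bayesian model, here is a small sketch (not taken from sandpiper; the network, the probabilities, and the function names are hypothetical) that treats the conditional probability tables of a two-node network Rain -> Wet as factors and computes a posterior from them. On a graph this small, plain enumeration gives the same answer BP would.

```python
# Hypothetical, already-estimated parameters of a two-node Bayesian network.
p_rain = [0.8, 0.2]                 # P(Rain=0), P(Rain=1)
p_wet_given_rain = [[0.9, 0.1],     # P(Wet=w | Rain=0)
                    [0.2, 0.8]]     # P(Wet=w | Rain=1)

def factors_from_bayes_net():
    # Each CPT becomes one factor over the variables it mentions.
    f_rain = {(r,): p_rain[r] for r in (0, 1)}
    f_wet = {(r, w): p_wet_given_rain[r][w] for r in (0, 1) for w in (0, 1)}
    return f_rain, f_wet

def posterior_rain(wet_observed):
    # Multiply the factors with Wet clamped to the observation, then normalize.
    f_rain, f_wet = factors_from_bayes_net()
    unnormalized = [f_rain[(r,)] * f_wet[(r, wet_observed)] for r in (0, 1)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

if __name__ == "__main__":
    print(posterior_rain(1))
```

A Markov model would be handled the same way, except that its factors are arbitrary non-negative tables rather than normalized conditional distributions.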

So to answer the question below, we don't do "learning", we do "inference" using BP.

We used both LibDAI and our own implementation of BP for GraphLab as references.


Best regards, Manish Marwah & Alexander

________________________________
From: Bertrand Dechoux <de...@gmail.com>
Sent: Thursday, December 15, 2016 1:03:49 AM
To: Bryan Cutler
Cc: Ulanov, Alexander; user; dev
Subject: Re: Belief propagation algorithm is open sourced

Nice! I am especially interested in Bayesian Networks, which are only one of the many models that can be expressed by a factor graph representation. Do you do Bayesian Networks learning at scale (parameters and structure) with latent variables? Are you using publicly available tools for that? Which ones?

LibDAI, which created the supported format, "supports parameter learning of conditional probability tables by Expectation Maximization" according to the documentation. Is it your reference tool?

Bertrand

On Thu, Dec 15, 2016 at 5:21 AM, Bryan Cutler <cu...@gmail.com> wrote:
I'll check it out, thanks for sharing Alexander!

On Dec 13, 2016 4:58 PM, "Ulanov, Alexander" <al...@hpe.com> wrote:

Dear Spark developers and users,


HPE has open sourced the implementation of the belief propagation (BP) algorithm for Apache Spark, a popular message passing algorithm for performing inference in probabilistic graphical models. It provides exact inference for graphical models without loops. While inference for graphical models with loops is approximate, in practice it is shown to work well. The implementation is generic and operates on factor graph representation of graphical models. It handles factors of any order, and variable domains of any size. It is implemented with Apache Spark GraphX, and thus can scale to large scale models. Further, it supports computations in log scale for numerical stability. Large scale applications of BP include fraud detection in banking transactions and malicious site detection in computer networks.


Source code: https://github.com/HewlettPackard/sandpiper


Best regards, Alexander

Re: Belief propagation algorithm is open sourced

Posted by Bertrand Dechoux <de...@gmail.com>.

Nice! I am especially interested in Bayesian Networks, which are only one
of the many models that can be expressed by a factor graph representation.
Do you do Bayesian Networks learning at scale (parameters and structure)
with latent variables? Are you using publicly available tools for that?
Which ones?

LibDAI, which created the supported format, "supports parameter learning of
conditional probability tables by Expectation Maximization" according to
the documentation. Is it your reference tool?

Bertrand

On Thu, Dec 15, 2016 at 5:21 AM, Bryan Cutler <cu...@gmail.com> wrote:

> I'll check it out, thanks for sharing Alexander!
>
> On Dec 13, 2016 4:58 PM, "Ulanov, Alexander" <al...@hpe.com>
> wrote:
>
>> Dear Spark developers and users,
>>
>>
>> HPE has open sourced the implementation of the belief propagation (BP)
>> algorithm for Apache Spark, a popular message passing algorithm for
>> performing inference in probabilistic graphical models. It provides exact
>> inference for graphical models without loops. While inference for graphical
>> models with loops is approximate, in practice it is shown to work well. The
>> implementation is generic and operates on factor graph representation of
>> graphical models. It handles factors of any order, and variable domains of
>> any size. It is implemented with Apache Spark GraphX, and thus can scale to
>> large scale models. Further, it supports computations in log scale for
>> numerical stability. Large scale applications of BP include fraud detection
>> in banking transactions and malicious site detection in computer
>> networks.
>>
>>
>> Source code: https://github.com/HewlettPackard/sandpiper
>>
>>
>> Best regards, Alexander
>>
>

Re: Belief propagation algorithm is open sourced

Posted by Bryan Cutler <cu...@gmail.com>.

I'll check it out, thanks for sharing Alexander!

On Dec 13, 2016 4:58 PM, "Ulanov, Alexander" <al...@hpe.com>
wrote:

> Dear Spark developers and users,
>
>
> HPE has open sourced the implementation of the belief propagation (BP)
> algorithm for Apache Spark, a popular message passing algorithm for
> performing inference in probabilistic graphical models. It provides exact
> inference for graphical models without loops. While inference for graphical
> models with loops is approximate, in practice it is shown to work well. The
> implementation is generic and operates on factor graph representation of
> graphical models. It handles factors of any order, and variable domains of
> any size. It is implemented with Apache Spark GraphX, and thus can scale to
> large scale models. Further, it supports computations in log scale for
> numerical stability. Large scale applications of BP include fraud detection
> in banking transactions and malicious site detection in computer networks.
>
>
> Source code: https://github.com/HewlettPackard/sandpiper
>
>
> Best regards, Alexander
>
