Posted to user@spark.apache.org by Ankur Das <da...@gmail.com> on 2020/09/06 13:30:17 UTC

Query about Spark

Good evening Sir/Madam,
I hope you are doing well. I am experimenting with some ML techniques that I
need to test in a distributed environment.
For example, I want to run a particular algorithm on different nodes at
the same time and collect the results at the end on one single node (the
parent node).

So, I would like to know whether it is possible, and a good choice, to use
Spark for this.

I hope to hear from you soon. Stay safe and healthy.
Thank you in advance.
-- 
Regards,
Ankur J Das
Research Scholar @ Tezpur University
Tezpur, Assam
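
The pattern described in the question (run the same algorithm on several
nodes at once, then gather the results on the parent node) maps fairly
directly onto a Spark map followed by collect(). The following is only a
minimal sketch under stated assumptions: run_trial is a hypothetical
stand-in for the real algorithm, the seeds are made-up inputs, and PySpark
is assumed to be installed on the driver and the executors.

```python
import random


def run_trial(seed):
    """Hypothetical stand-in for one independent run of the ML algorithm."""
    rng = random.Random(seed)
    # Placeholder "training": the mean of 1000 pseudo-random draws.
    score = sum(rng.random() for _ in range(1000)) / 1000
    return seed, score


def run_on_spark(seeds):
    """Run run_trial once per seed on the executors and gather the results."""
    # Imported lazily so run_trial can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parallel-trials").getOrCreate()
    try:
        # One partition per seed, so each trial is scheduled as its own task.
        rdd = spark.sparkContext.parallelize(seeds, numSlices=max(1, len(seeds)))
        # collect() returns every (seed, score) pair to the driver (parent) node.
        return rdd.map(run_trial).collect()
    finally:
        spark.stop()
```

Calling run_on_spark([1, 2, 3, 4]) from the driver program would return a
list of (seed, score) pairs. This fits when the runs are independent of each
other; an algorithm whose workers must exchange state during training needs
a different approach.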

Re: Query about Spark

Posted by Ankur Das <da...@gmail.com>.
Thanks a lot.

On Mon, Sep 7, 2020 at 8:36 PM ☼ R Nair <ra...@gmail.com> wrote:

> Please read this as well, thanks
>
> Disclaimer: it's my article.
>
>
> https://medium.com/@ravishankar.nair/online-and-batch-based-ml-execution-from-same-python-code-preserving-pre-and-post-transformation-ea7ebc27f50f?sk=c33bcf1d6c28b562b7bd36fa39809294
>
> Best, Ravion
>
> On Mon, Sep 7, 2020, 8:29 AM Enrico Minack <ma...@enrico.minack.dev> wrote:
>
>> You could use Horovod to distribute your ML algorithm on a cluster;
>> Horovod also supports Spark clusters.
>>
>> Enrico


Re: Query about Spark

Posted by ☼ R Nair <ra...@gmail.com>.
Please read this as well, thanks

Disclaimer: it's my article.

https://medium.com/@ravishankar.nair/online-and-batch-based-ml-execution-from-same-python-code-preserving-pre-and-post-transformation-ea7ebc27f50f?sk=c33bcf1d6c28b562b7bd36fa39809294

Best, Ravion

On Mon, Sep 7, 2020, 8:29 AM Enrico Minack <ma...@enrico.minack.dev> wrote:

> You could use Horovod to distribute your ML algorithm on a cluster;
> Horovod also supports Spark clusters.
>
> Enrico

Re: Query about Spark

Posted by Enrico Minack <ma...@Enrico.Minack.dev>.
You could use Horovod to distribute your ML algorithm on a cluster;
Horovod also supports Spark clusters.

Enrico
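
A rough sketch of what the Horovod suggestion could look like, using
horovod.spark.run to start one copy of a function per process on the Spark
cluster. Everything specific here is an assumption and not from the thread:
the shard and train_fn helpers are hypothetical, the "training" is a
placeholder sum, and Horovod is assumed to be installed with its PyTorch
backend (horovod.torch) alongside PySpark.

```python
def shard(items, rank, size):
    """Return the slice of items that the worker with this rank handles."""
    return items[rank::size]


def train_fn(items):
    """Hypothetical per-worker function; Horovod runs one copy per process."""
    # Assumes Horovod was built with PyTorch support.
    import horovod.torch as hvd

    hvd.init()
    part = shard(items, hvd.rank(), hvd.size())
    # Placeholder for real training on this worker's share of the data.
    return hvd.rank(), sum(part)


def run_with_horovod(items, num_proc=2):
    """Launch train_fn on num_proc Spark tasks and return the per-rank results."""
    # Imported lazily so shard() can be used without Horovod or PySpark.
    import horovod.spark
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("horovod-sketch").getOrCreate()
    try:
        # horovod.spark.run returns the value returned by train_fn on each rank.
        return horovod.spark.run(train_fn, args=(items,), num_proc=num_proc)
    finally:
        spark.stop()
```

Unlike a plain map and collect, the workers started this way can communicate
during training (for example, to average gradients), which is the main reason
to reach for Horovod.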



Re: Query about Spark

Posted by Ankur Das <da...@gmail.com>.
Thanks, I'll check it out.

On Sun, Sep 6, 2020 at 7:15 PM ☼ R Nair <ra...@gmail.com> wrote:

> Or use MLflow's PySpark UDF. First create an mlflow.pyfunc model.
>
> Best, Ravion
>
> On Sun, Sep 6, 2020, 9:43 AM ☼ R Nair <ra...@gmail.com> wrote:
>
>> The question is not clear. If I understood it correctly, use accumulators.
>>
>> Best, Ravion


Re: Query about Spark

Posted by ☼ R Nair <ra...@gmail.com>.
Or use MLflow's PySpark UDF. First create an mlflow.pyfunc model.

Best, Ravion
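
One possible reading of this suggestion, as a hedged sketch: wrap the scoring
logic in an mlflow.pyfunc.PythonModel, save it, and load it back as a Spark
UDF with mlflow.pyfunc.spark_udf. The score function, the ScoreModel class,
the model_dir path, and the column name "x" are all hypothetical, and MLflow,
pandas, and PySpark are assumed to be installed.

```python
def score(x):
    """Hypothetical scoring rule standing in for a trained model."""
    return 2.0 * x + 1.0


def save_and_apply(model_dir):
    """Save a pyfunc model to model_dir, then apply it to a DataFrame as a UDF."""
    # Imported lazily so score() can be used without MLflow or PySpark.
    import mlflow.pyfunc
    from pyspark.sql import SparkSession

    class ScoreModel(mlflow.pyfunc.PythonModel):
        def predict(self, context, model_input):
            # model_input arrives as a pandas DataFrame; score its first column.
            return model_input.iloc[:, 0].apply(score)

    # save_model expects that model_dir does not exist yet.
    mlflow.pyfunc.save_model(path=model_dir, python_model=ScoreModel())

    spark = SparkSession.builder.appName("mlflow-udf-sketch").getOrCreate()
    try:
        predict_udf = mlflow.pyfunc.spark_udf(
            spark, model_uri=model_dir, result_type="double"
        )
        df = spark.createDataFrame([(1.0,), (2.0,)], ["x"])
        # The UDF runs on the executors; collect() gathers rows on the driver.
        return df.withColumn("prediction", predict_udf("x")).collect()
    finally:
        spark.stop()
```

This covers distributed scoring with an already-built model. It does not
distribute the training itself.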

On Sun, Sep 6, 2020, 9:43 AM ☼ R Nair <ra...@gmail.com> wrote:

> The question is not clear. If I understood it correctly, use accumulators.
>
> Best, Ravion

Re: Query about Spark

Posted by ☼ R Nair <ra...@gmail.com>.
The question is not clear. If I understood it correctly, use accumulators.

Best, Ravion
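
For reference, a minimal sketch of a Spark accumulator, assuming PySpark is
installed. The is_valid check and the sample records are hypothetical. Tasks
on the executors can only add to an accumulator; its value is read back on
the driver, so it suits counters and sums more than full per-node results.

```python
def is_valid(record):
    """Hypothetical per-record check; replace with the real criterion."""
    return isinstance(record, (int, float)) and record >= 0


def count_invalid(records):
    """Count invalid records across all executors with an accumulator."""
    # Imported lazily so is_valid can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("accumulator-sketch").getOrCreate()
    sc = spark.sparkContext
    try:
        invalid = sc.accumulator(0)

        def check(record):
            # Runs on the executors; tasks may only add to the accumulator.
            if not is_valid(record):
                invalid.add(1)

        # foreach is an action, so the updates are applied when it completes.
        sc.parallelize(records).foreach(check)
        # The accumulated total is only readable on the driver.
        return invalid.value
    finally:
        spark.stop()
```

For example, count_invalid([1, -2, 3.5, "x"]) would report how many of the
four records fail the check, with the counting spread over the executors.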
