Posted to dev@systemml.apache.org by Niketan Pansare <np...@us.ibm.com> on 2015/12/03 23:32:21 UTC

Open tasks: Integration with MLPipeline


Hi all,

In this email, I list the open tasks related to integration with
MLPipeline. This allows external developers to contribute to the SystemML
project until our JIRA server is up and running.

1. Make the existing logistic regression wrapper more robust:
- Extend the wrapper or the DML script to handle zero-based labels (either
throw an error or support them).

2. Improve the performance of the logistic regression wrapper:
- Profile the wrapper to find potential bottlenecks. The likely candidates
are RDDConverterUtilsExt.vectorDataFrameToBinaryBlock and lines 153-158 in
LogisticRegressionModel.

3. Perform a detailed performance analysis of the converter utils.
- Also explore the usability of these utils.

4. Add MLPipeline wrappers for existing scripts.
- Refer to
https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
to pick an algorithm, and to
http://apache.github.io/incubator-systemml/algorithms-reference.html to
understand the assumptions and parameters of that algorithm.
- A good algorithm to start with is L2SVM:
http://apache.github.io/incubator-systemml/algorithms-classification.html#binary-class-support-vector-machines
https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm.dml
https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm-predict.dml

5. Add documentation for the MLPipeline wrappers to
http://apache.github.io/incubator-systemml/index.html
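
To make the two options in item 1 concrete, here is a minimal sketch of the label check. It is written in plain Python for brevity (the actual wrapper is Java), and it assumes the DML script expects class labels starting at 1. The function name normalize_labels and the on_zero parameter are hypothetical and do not exist in SystemML.

```python
from typing import List, Sequence


def normalize_labels(labels: Sequence[int], on_zero: str = "shift") -> List[int]:
    """Return labels that an (assumed) one-based script can consume.

    on_zero="error": raise if the smallest label is 0.
    on_zero="shift": add 1 to every label when the smallest label is 0.
    """
    if on_zero not in ("error", "shift"):
        raise ValueError(f"unknown on_zero policy: {on_zero!r}")
    if len(labels) == 0:
        return []
    smallest = min(labels)
    if smallest < 0:
        raise ValueError("negative labels are not handled in this sketch")
    if smallest == 0:
        if on_zero == "error":
            raise ValueError("zero-based labels found; expected labels starting at 1")
        # Shift the whole label range so that it starts at 1.
        return [label + 1 for label in labels]
    # Labels already start at 1 or higher; pass them through unchanged.
    return list(labels)
```

If the shift option is chosen, the wrapper would also need to subtract 1 from the predicted labels before returning them, so that callers see the label range they supplied.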

References:
1. Existing Logistic regression wrappers:
https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/api/ml/LogisticRegression.java
https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/api/ml/LogisticRegressionModel.java

2. Converter utils:
https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
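
For items 2 and 3, one simple way to start is to accumulate wall-clock time per named stage before reaching for a full profiler. The sketch below is plain Python and only illustrates the bookkeeping; the StageTimer class and the stage names are hypothetical. Because Spark evaluates lazily, a timed region would need to include an action that forces the conversion to run, otherwise the measured time will be misleadingly small.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List, Tuple


class StageTimer:
    """Accumulates elapsed wall-clock time and call counts per stage name."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = defaultdict(float)
        self.calls: Dict[str, int] = defaultdict(int)

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the timed block raises.
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1

    def report(self) -> List[Tuple[str, float]]:
        """Return (stage, total seconds) pairs, slowest stage first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


timer = StageTimer()
with timer.stage("convert"):      # hypothetical stage: DataFrame conversion
    sum(range(1000))
with timer.stage("convert"):
    sum(range(1000))
with timer.stage("predict"):      # hypothetical stage: model scoring
    sum(range(10))
```

Calling timer.report() afterwards lists the stages ordered by total time, which is usually enough to decide where a real profiler should be pointed.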

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

Re: Open tasks: Integration with MLPipeline

Posted by Tsuyoshi Ozawa <oz...@apache.org>.
Hi Deron,

Oh, I missed that the JIRA import is still a work in progress. Thank you
for letting me know.

Thanks,
- Tsuyoshi

On Thu, Dec 17, 2015 at 10:40 AM, Deron Eriksson
<de...@gmail.com> wrote:
> Hi Tsuyoshi,
>
> We are still having some issues getting our JIRA project data imported into
> Apache JIRA. Please see: https://issues.apache.org/jira/browse/INFRA-10714
>
> We are hoping that the 3 missing fields will import correctly very soon. If
> not, we will manually handle the issue so that we can begin using JIRA
> again to track our issues. Thank you for your patience!
>
> Deron

Re: Open tasks: Integration with MLPipeline

Posted by Deron Eriksson <de...@gmail.com>.
Hi Tsuyoshi,

We are still having some issues getting our JIRA project data imported into
Apache JIRA. Please see: https://issues.apache.org/jira/browse/INFRA-10714

We are hoping that the 3 missing fields will import correctly very soon. If
not, we will manually handle the issue so that we can begin using JIRA
again to track our issues. Thank you for your patience!

Deron


On Wed, Dec 16, 2015 at 5:16 PM, Tsuyoshi Ozawa <
ozawa.tsuyoshi@lab.ntt.co.jp> wrote:

> Hi Niketan,
>
> The jira for SystemML seems to be open now:
> https://issues.apache.org/jira/browse/SYSTEMML
>
> Would you mind creating issues for the tasks? Using JIRA would let us avoid
> conflicting assignees.
>
> Thanks,
> - Tsuyoshi

RE: Open tasks: Integration with MLPipeline

Posted by Tsuyoshi Ozawa <oz...@lab.ntt.co.jp>.
Hi Niketan,

The jira for SystemML seems to be open now:
https://issues.apache.org/jira/browse/SYSTEMML

Would you mind creating issues for the tasks? Using JIRA would let us avoid
conflicting assignees.

Thanks,
- Tsuyoshi


-----Original Message-----
From: Glenn Weidner [mailto:gweidner@us.ibm.com] 
Sent: Tuesday, December 15, 2015 5:28 AM
To: dev@systemml.incubator.apache.org
Cc: npansar@us.ibm.com
Subject: Re: Open tasks: Integration with MLPipeline

Hi,

I'm interested in working on item 4:

4. Add MLPipeline wrappers for existing scripts.
- Refer to
https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
to pick the algorithm and
http://apache.github.io/incubator-systemml/algorithms-reference.html to
understand the assumptions as well as parameters to the given algorithm.
- A good algorithm to start with is L2SVM:
http://apache.github.io/incubator-systemml/algorithms-classification.html#binary-class-support-vector-machines
https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm.dml
https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm-predict.dml

Thanks,
Glenn


Re: Open tasks: Integration with MLPipeline

Posted by Glenn Weidner <gw...@us.ibm.com>.
Hi,

I'm interested in working on item 4:

4. Add MLPipeline wrappers for existing scripts.
- Refer to
https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
to pick the algorithm and
http://apache.github.io/incubator-systemml/algorithms-reference.html to
understand the assumptions as well as parameters to the given algorithm.
- A good algorithm to start with is L2SVM:
http://apache.github.io/incubator-systemml/algorithms-classification.html#binary-class-support-vector-machines

https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm.dml

https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/l2-svm-predict.dml


Thanks,
Glenn




From:	Niketan Pansare/Almaden/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Cc:	"Tatsuya Nishiyama" <ni...@lab.ntt.co.jp>
Date:	12/03/2015 02:32 PM
Subject:	Open tasks: Integration with MLPipeline




