Posted to dev@spark.apache.org by Takeshi Yamamuro <li...@gmail.com> on 2017/01/17 16:11:13 UTC

GraphX-related "open" issues

Hi, devs

Sorry to bother you, but please let me check in advance:
in JIRA, there are some open (and inactive) issues about GraphX features.
IIUC, the current GraphX features are almost frozen and
will probably get no modifications except for critical bug fixes.
So, IMO it seems okay to close tickets about "Improvement" and "New
Feature" for now.
Thoughts?

If there is any problem, please let me know; otherwise, I'll close the tickets below
this weekend:
---
New Feature SPARK-15880
 - PREGEL Based Semi-Clustering Algorithm Implementation using Spark GraphX API
New Feature SPARK-13733
 - Support initial weight distribution in personalized PageRank
Improvement SPARK-13460
 - Applying Encoding methods to GraphX's Internal storage structure
New Feature SPARK-10758
 - approximation algorithms to speedup triangle count and clustering coefficient computation in GraphX
Improvement SPARK-10335
 - GraphX Connected Components fail with large number of iterations
New Feature SPARK-8497
 - Graph Clique(Complete Connected Sub-graph) Discovery Algorithm
New Feature SPARK-7258
 - spark.ml API taking Graph instead of DataFrame
New Feature SPARK-7257
 - Find nearest neighbor satisfying predicate
New Feature SPARK-7244
 - Find vertex sequences satisfying predicates
New Feature SPARK-4763
 - All-pairs shortest paths algorithm
Improvement SPARK-3373
 - Filtering operations should optionally rebuild routing tables


-- 
---
Takeshi Yamamuro (maropu)

Re: GraphX-related "open" issues

Posted by Michael Allman <mi...@videoamp.com>.
Yes, SPARK-10335 is a bug that will be fixed when SPARK-5484 is fixed.

> On Jan 19, 2017, at 10:36 PM, Takeshi Yamamuro <li...@gmail.com> wrote:
> 
> Shouldn't SPARK-10335 be tagged as "Bug"? If so, I think we should not close it and should fix it in the future.
> 
> // maropu


Re: GraphX-related "open" issues

Posted by Takeshi Yamamuro <li...@gmail.com>.
Shouldn't SPARK-10335 be tagged as "Bug"? If so, I think we should not
close it and should fix it in the future.

// maropu

On Fri, Jan 20, 2017 at 1:27 PM, Michael Allman <mi...@videoamp.com>
wrote:

> That sounds fine to me. I think that in closing the issues, we should
> mention that we're closing them because these algorithms can be implemented
> using the existing API.
>
> Michael


-- 
---
Takeshi Yamamuro

Re: GraphX-related "open" issues

Posted by Michael Allman <mi...@videoamp.com>.
That sounds fine to me. I think that in closing the issues, we should mention that we're closing them because these algorithms can be implemented using the existing API.

Michael


> On Jan 19, 2017, at 5:34 PM, Dongjin Lee <do...@apache.org> wrote:
> 
> Thanks for your comments. Then, how about changing the following issues (see below) to 'Won't Fix'? After implementing and uploading them as Spark Packages, commenting on those issues would be a reasonable solution. It would also be better for the potential users of those graph algorithms.
> 
> - SPARK-15880: PREGEL Based Semi-Clustering Algorithm Implementation using Spark GraphX API <https://issues.apache.org/jira/browse/SPARK-15880>
> - SPARK-7244: Find vertex sequences satisfying predicates <https://issues.apache.org/jira/browse/SPARK-7244>
> - SPARK-7257: Find nearest neighbor satisfying predicate <https://issues.apache.org/jira/browse/SPARK-7257>
> - SPARK-8497: Graph Clique(Complete Connected Sub-graph) Discovery Algorithm <https://issues.apache.org/jira/browse/SPARK-8497>
> 
> Best,
> Dongjin

Re: GraphX-related "open" issues

Posted by Dongjin Lee <do...@apache.org>.
Thanks for your comments. Then, how about changing the following issues (see
below) to 'Won't Fix'? After implementing and uploading them as Spark
Packages, commenting on those issues would be a reasonable solution. It
would also be better for the potential users of those graph algorithms.

- SPARK-15880: PREGEL Based Semi-Clustering Algorithm Implementation using
Spark GraphX API <https://issues.apache.org/jira/browse/SPARK-15880>
- SPARK-7244: Find vertex sequences satisfying predicates
<https://issues.apache.org/jira/browse/SPARK-7244>
- SPARK-7257: Find nearest neighbor satisfying predicate
<https://issues.apache.org/jira/browse/SPARK-7257>
- SPARK-8497: Graph Clique(Complete Connected Sub-graph) Discovery Algorithm
<https://issues.apache.org/jira/browse/SPARK-8497>

Best,
Dongjin

On Fri, Jan 20, 2017 at 2:48 AM, Michael Allman <mi...@videoamp.com>
wrote:

> Regarding new GraphX algorithms, I am in agreement with the idea of
> publishing algorithms which are implemented using the existing API as
> outside packages.
>
> Regarding SPARK-10335, we have a PR for SPARK-5484 which should address
> the problem described in that ticket. I've reviewed that PR, but because it
> touches the ML codebase I'd like to get an ML committer to review that PR.
> It's a relatively simple change and fixes a significant barrier to scaling
> in GraphX.
>
> https://github.com/apache/spark/pull/15125
>
> Cheers,
>
> Michael


-- 
Dongjin Lee

Software developer in Line+.
So interested in massive-scale machine learning.

facebook: www.facebook.com/dongjin.lee.kr
linkedin: kr.linkedin.com/in/dongjinleekr
github: github.com/dongjinleekr
twitter: www.twitter.com/dongjinleekr

Re: GraphX-related "open" issues

Posted by Michael Allman <mi...@videoamp.com>.
Regarding new GraphX algorithms, I am in agreement with the idea of publishing algorithms which are implemented using the existing API as outside packages.

Regarding SPARK-10335, we have a PR for SPARK-5484 which should address the problem described in that ticket. I've reviewed that PR, but because it touches the ML codebase I'd like to get an ML committer to review that PR. It's a relatively simple change and fixes a significant barrier to scaling in GraphX.

https://github.com/apache/spark/pull/15125

Cheers,

Michael


> On Jan 19, 2017, at 8:09 AM, Takeshi Yamamuro <li...@gmail.com> wrote:
> 
> Thanks for your comment, Dongjin!
> I have a pretty basic but also important question: why do you implement these features as a third-party library (and then upload them to Spark Packages, https://spark-packages.org/)? ISTM GraphX already has the necessary and sufficient APIs for these third-party ones.


Re: GraphX-related "open" issues

Posted by Takeshi Yamamuro <li...@gmail.com>.
Thanks for your comment, Dongjin!
I have a pretty basic but also important question: why do you implement
these features as a third-party library (and then upload them to Spark
Packages, https://spark-packages.org/)? ISTM GraphX already has the necessary
and sufficient APIs for these third-party ones.

On Thu, Jan 19, 2017 at 12:21 PM, Dongjin Lee <do...@apache.org> wrote:

> Hi all,
>
> I am currently working on SPARK-15880[^1] and also have some interest
> in SPARK-7244[^2] and SPARK-7257[^3]. In fact, SPARK-7244 and SPARK-7257
> are of some importance in the graph analysis field.
> Could you make them an exception? Since I am working on graph analysis, I
> hope to take them.
>
> If needed, I can take SPARK-10335 and SPARK-8497 after them.
>
> Thanks,
> Dongjin



-- 
---
Takeshi Yamamuro

Re: GraphX-related "open" issues

Posted by Dongjin Lee <do...@apache.org>.
Hi all,

I am currently working on SPARK-15880[^1] and also have some interest
in SPARK-7244[^2] and SPARK-7257[^3]. In fact, SPARK-7244 and SPARK-7257
are of some importance in the graph analysis field.
Could you make them an exception? Since I am working on graph analysis, I
hope to take them.

If needed, I can take SPARK-10335 and SPARK-8497 after them.

Thanks,
Dongjin

On Wed, Jan 18, 2017 at 2:40 AM, Sean Owen <so...@cloudera.com> wrote:

> WontFix or Later is fine. There's not really any practical distinction. I
> figure that if something times out and is closed, it's very unlikely to be
> looked at again. Therefore marking it as something to do 'later' seemed
> less accurate.


-- 
Dongjin Lee

Software developer in Line+.
So interested in massive-scale machine learning.

facebook: www.facebook.com/dongjin.lee.kr
linkedin: kr.linkedin.com/in/dongjinleekr
github: github.com/dongjinleekr
twitter: www.twitter.com/dongjinleekr

Re: GraphX-related "open" issues

Posted by Sean Owen <so...@cloudera.com>.
WontFix or Later is fine. There's not really any practical distinction. I
figure that if something times out and is closed, it's very unlikely to be
looked at again. Therefore marking it as something to do 'later' seemed
less accurate.

On Tue, Jan 17, 2017 at 5:30 PM Takeshi Yamamuro <li...@gmail.com>
wrote:

> Thanks for your comment!
> I'm thinking of setting "Won't Fix", though "Later" is also okay.
> But I re-checked "Contributing to JIRA Maintenance" in the contribution
> guide (http://spark.apache.org/contributing.html) and
> I couldn't find any policy about setting "Later".
> So, IMO it's okay to set "Won't Fix" for now, and those who'd like to make
> PRs should feel free to (re-)open the tickets.

Re: GraphX-related "open" issues

Posted by Takeshi Yamamuro <li...@gmail.com>.
Thanks for your comment!
I'm thinking of setting "Won't Fix", though "Later" is also okay.
But I re-checked "Contributing to JIRA Maintenance" in the contribution
guide (http://spark.apache.org/contributing.html) and
I couldn't find any policy about setting "Later".
So, IMO it's okay to set "Won't Fix" for now, and those who'd like to make
PRs should feel free to (re-)open the tickets.


On Wed, Jan 18, 2017 at 1:48 AM, Dongjoon Hyun <do...@apache.org> wrote:

> Hi, Takeshi.
>
> > So, IMO it seems okay to close tickets about "Improvement" and "New
> Feature" for now.
>
> I'm just wondering about what kind of field value you want to fill in the
> `Resolution` field for those issues.
>
> Maybe, 'Later'? Or, 'Won't Fix'?
>
> Bests,
> Dongjoon.


-- 
---
Takeshi Yamamuro

Re: GraphX-related "open" issues

Posted by Dongjoon Hyun <do...@apache.org>.
Hi, Takeshi.

> So, IMO it seems okay to close tickets about "Improvement" and "New Feature" for now.

I'm just wondering about what kind of field value you want to fill in the `Resolution` field for those issues.

Maybe, 'Later'? Or, 'Won't Fix'?

Bests,
Dongjoon.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org