You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tinkerpop.apache.org by GitBox <gi...@apache.org> on 2020/12/22 10:52:18 UTC

[GitHub] [tinkerpop] junshiguo opened a new pull request #1375: TINKERPOP-2487 add stdev and percentile steps

junshiguo opened a new pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375


   https://issues.apache.org/jira/browse/TINKERPOP-2487
   
   Added two frequently used analytical steps for calculation of standard deviation and percentile value. The example usage is
   
   ```
   gremlin> g.V().values('ages')
   ==>1
   ==>2
   ==>3
   gremlin> g.V().values('ages').stdev()
   ==>0.816
   gremlin> g.V().values('ages').fold().stdev(Scope.local)
   ==>0.816
   
   gremlin> g.V().values('ages').percentile(50)
   ==>2
   // one percentile, return single value
   gremlin> g.V().values('ages').percentile(0, 100)
   ==>[0: 1, 100: 3]
   // multiple percentiles, return a map
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] junshiguo commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
junshiguo commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-778167250


   @spmallette Sorry I missed the messages. I am not very familiar with the GLV tests and having trouble fixing the test failures. It seems like the generated `gremlin-javascript/src/main/javascript/gremlin-javascript/test/cucumber/gremlin.js` is not correct. Would you please help here?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] krlawrence commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
krlawrence commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-1080966381


   This PR looked promising but appears to have stalled. Are you considering completing this work @junshiguo ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tinkerpop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] krlawrence commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
krlawrence commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-755761973


   That sounds like a sensible compromise @spmallette - for 3.5.0 it would be nice to try to get whatever remaining steps we think are  needed by statisticians included as part of the release.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-755758466


   i'm not against accepting this with just the steps present but i see where you are coming from @krlawrence . 
   
   does it specifically need to be part of this PR? perhaps we could just make the additional steps a blocker for 3.5.0 and not release without it? how does that sound? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-773983784


   Hi @junshiguo - I was just wondering if you planned to return to this pull request? If not, I will plan to pick up where you left off. Thanks for your efforts so far.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] krlawrence commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
krlawrence commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-755731427


   I would really like to see a `product` step added as part of this PR. I think it only adds `stddev` and `percentile` from looking at the code. If we are going to add new steps for statisticians,  I would prefer we do the complete set rather than do this piecemeal. I had also suggested above that `variance` would be appropriate to also add.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-755351381


   @junshiguo sorry, it's taken some time to get to reviewing this. i'll start with some bigger picture feedback first:
   
   1. Could you please add some [Upgrade Documentation](https://github.com/apache/tinkerpop/blob/a78fc7dbe26f46f01052ea636eb3b2f394e98d74/docs/src/upgrade/release-3.5.x.asciidoc) to call attention to this new feature. 
   2. I see that you supplied changes for .NET/Javascript - will you be able to add python as well?
   3. I see some tests, but don't see any tests for the Gremlin Process Suite ([example](https://github.com/apache/tinkerpop/blob/64ec9fbc9e6d368f1a251b25cf7b120ace09298c/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/MeanTest.java)) and GLV suite ([example](https://github.com/apache/tinkerpop/blob/64ec9fbc9e6d368f1a251b25cf7b120ace09298c/gremlin-test/features/map/Mean.feature))
   
   Once those more major items are settled I can look at the code itself more carefully. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-764631171


   Hello @junshiguo , I just thought I'd check in on this PR. I noticed there were still some test failures here. Did you need some help looking into them or have you just not had time to come back to this yet?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-764631171


   Hello @junshiguo , I just thought I'd check in on this PR. I noticed there were still some test failures here. Did you need some help looking into them or have you just not had time to come back to this yet?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] junshiguo closed pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
junshiguo closed pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-779802925


   After looking at this in more detail, I think that there are several things to discuss:
   
   1. Part of the problem is in the `JavaTranslator` which isn't recognized the signature of your `int` varargs because it checks for `Object[]` and not the primitive forms. That's worked until now. I'm not sure how we best resolve that, but I'm not sure it's best to keep hacking in "side cases" to `JavaTranslator`. 
   2. I wonder if `int` is the right data type there. Looking at SQL and Cypher their approach is to use double between 0.0 and 1.0.
   3. I wonder about the return type which changes based on the number of percentiles - i.e. you get a number for a single percentile but a `Map` for multiple. There is some precedence for that with `select()` though.
   
   I think that items 2 and 3 probably need to be brought to the dev list for further discussion to see if anyone has any thoughts on the matter. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-758646071


   Thanks for updating the PR and offering to do the `product` and `variance` work on a future PR. Looks like there are some test failures with gremlin-javascript which is causing travis to not pass the build.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] krlawrence edited a comment on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
krlawrence edited a comment on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-755731427


   I would really like to see a `product` step added as part of this PR. I think it only adds `stddev` and `percentile` from looking at the code. If we are going to add new steps for statisticians,  I would prefer we do the complete set rather than do this piecemeal. I had also suggested in the earlier discussions that `variance` would be appropriate to also add.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] junshiguo commented on pull request #1375: TINKERPOP-2487 add stdev and percentile steps

Posted by GitBox <gi...@apache.org>.
junshiguo commented on pull request #1375:
URL: https://github.com/apache/tinkerpop/pull/1375#issuecomment-758603983


   @spmallette Sorry for the late reply. The PR is updated with documentation, python support and tests as requested. 
   `product` and `variance` steps implementation should be very similar. I will implement them later and submit for review.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org