Posted to dev@metron.apache.org by cestella <gi...@git.apache.org> on 2017/09/01 14:52:45 UTC

[GitHub] metron pull request #728: METRON-1148: Add SET and MULTISET data structures ...

GitHub user cestella opened a pull request:

    https://github.com/apache/metron/pull/728

    METRON-1148: Add SET and MULTISET data structures to stellar

    ## Contributor Comments
    With the addition of geohashes, analytics such as tracking the statistical distribution of the distances between a user's logins and the centroid of that user's logins over some period of time require a way to store sets (e.g. sets of geohashes) and multisets (sets with multiplicity) so that the profiler can persist them and merge them across time.
    This JIRA should add:
    * SET_INIT
    * SET_ADD
    * SET_REMOVE
    * SET_MERGE
    * MULTISET_INIT
    * MULTISET_ADD
    * MULTISET_REMOVE
    * MULTISET_MERGE
    * MULTISET_TO_SET
    These follow the pattern of the other data structures that are not Stellar language primitives.
    
    You can tinker with these in the stellar REPL as manual tests.
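
As a rough illustration of the semantics these functions are meant to provide, the following Python sketch models a set with `set` and a multiset with `collections.Counter`. It is not the Stellar implementation: the helper names (`set_merge`, `multiset_merge`, `multiset_to_set`) are hypothetical stand-ins that mirror the function names listed above, the geohash values are made up, and treating a multiset merge as summed multiplicities is an assumption.

```python
from collections import Counter
from typing import Iterable


def set_merge(sets: Iterable[set]) -> set:
    """Union of all sets, e.g. per-window sets combined across time."""
    merged: set = set()
    for s in sets:
        merged |= s
    return merged


def multiset_merge(multisets: Iterable[Counter]) -> Counter:
    """Combine multisets by adding multiplicities (assumed merge semantics)."""
    merged: Counter = Counter()
    for m in multisets:
        merged.update(m)
    return merged


def multiset_to_set(multiset: Counter) -> set:
    """Drop multiplicities, keeping elements that occur at least once."""
    return {item for item, count in multiset.items() if count > 0}


# Two time windows of geohashes seen for one user (made-up values).
window_1 = Counter(["9q8yy", "9q8yy", "9q8yz"])
window_2 = Counter(["9q8yy", "dr5ru"])

merged = multiset_merge([window_1, window_2])
distinct = multiset_to_set(merged)
```

The point of the sketch is that both structures merge associatively, so per-window values can be combined in any grouping when looking back across time.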
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). 
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && build_utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and or integration tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, then run the following commands and then verify the changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cestella/incubator-metron count_maps_for_stellar

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/728.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #728
    
----
commit 139d962af756a3f906d4bbcc7a7d8c8bab9d9299
Author: cstella <ce...@gmail.com>
Date:   2017-08-31T20:38:25Z

    adding sets and multisets.

commit 83a08fc8a1b2c6d7426fdcea9166e49ffb2bb0f8
Author: cstella <ce...@gmail.com>
Date:   2017-09-01T01:16:56Z

    tests added

commit 3683fc76151f0b126f8dc7d633e39c9e8eb4aa94
Author: cstella <ce...@gmail.com>
Date:   2017-09-01T14:46:55Z

    Adding better docs and a MULTISET_TO_SET function.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metron issue #728: METRON-1148: Add SET and MULTISET data structures to stel...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/metron/pull/728
  
    Thanks for the contribution!  This looks really nice.
    The question I have is about the validation of parameters and the default return.  With the FUZZY_SCORE effort, we discussed returning a default 0 score when input parameters were wrong, and changed it so that errors in parameters were made known.  This seems to me to be kind of the same thing.  If a non-iterable is passed, should it not be an error?
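
The two behaviours under discussion can be contrasted with a small Python sketch. It is only an analogy, not the code in this pull request: `set_init_lenient` and `set_init_strict` are hypothetical names, and the choice of `TypeError` is an assumption made for illustration.

```python
from collections.abc import Iterable


def set_init_lenient(values=None) -> set:
    """Return an empty set for a non-iterable argument, hiding the mistake."""
    if isinstance(values, Iterable):
        return set(values)
    return set()


def set_init_strict(values=None) -> set:
    """Raise on a non-iterable argument so the caller learns of the mistake."""
    if values is None:
        # No argument at all is treated as a request for an empty set.
        return set()
    if not isinstance(values, Iterable):
        raise TypeError(f"expected an iterable, got {type(values).__name__}")
    return set(values)
```

With the lenient variant, `set_init_lenient(42)` is indistinguishable from a legitimately empty set; the strict variant surfaces the bad argument at the call site.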


---

[GitHub] metron issue #728: METRON-1148: Add SET and MULTISET data structures to stel...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/728
  
    Never mind, I know what you're talking about now.  Sorry, I misunderstood :)  Yeah, I'll do a quick check and ensure that exceptions are thrown where appropriate.


---

[GitHub] metron pull request #728: METRON-1148: Add SET and MULTISET data structures ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/metron/pull/728


---

[GitHub] metron issue #728: METRON-1148: Add SET and MULTISET data structures to stel...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/728
  
    Ok, let me know if I missed any that you think should be there.


---

[GitHub] metron issue #728: METRON-1148: Add SET and MULTISET data structures to stel...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/728
  
    Which method are we talking about, @ottobackwards ?


---

[GitHub] metron issue #728: METRON-1148: Add SET and MULTISET data structures to stel...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/metron/pull/728
  
    Thanks Casey, +1 by inspection

