Posted to dev@hivemall.apache.org by myui <gi...@git.apache.org> on 2016/12/02 08:24:33 UTC

[GitHub] incubator-hivemall pull request #7: [WIP] Support Feature Selection UDFs

GitHub user myui opened a pull request:

    https://github.com/apache/incubator-hivemall/pull/7

    [WIP] Support Feature Selection UDFs

    This PR introduces two feature selection schemes: `Chi-Square test` and `Signal Noise Ratio`.
    
    This PR is based on [a pending PR](https://github.com/myui/hivemall/pull/385) by @amaya382 that was sent before Hivemall entered the Apache Incubator.
    
    See [JIRA](https://issues.apache.org/jira/browse/HIVEMALL-22) for tracking the status of this issue.
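
For readers unfamiliar with the two schemes named above, here is a minimal sketch of how chi-square and signal-to-noise-ratio scores can rank features. It uses textbook definitions and is not the code in this PR: the function names `chi2_scores`, `snr_scores`, and `top_k_indices` are hypothetical, the chi-square variant assumes non-negative feature values, and the SNR variant assumes exactly two classes.

```python
from statistics import mean, pstdev


def chi2_scores(X, y):
    """Chi-square statistic per feature (textbook form, non-negative features)."""
    classes = sorted(set(y))
    n_features = len(X[0])
    # observed[c][j]: sum of feature j over the rows labeled c
    observed = {c: [0.0] * n_features for c in classes}
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            observed[label][j] += value
    totals = [sum(observed[c][j] for c in classes) for j in range(n_features)]
    class_prob = {c: y.count(c) / len(y) for c in classes}
    scores = []
    for j in range(n_features):
        score = 0.0
        for c in classes:
            # expected sum if feature j were independent of the class
            expected = class_prob[c] * totals[j]
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def snr_scores(X, y):
    """|mean_a - mean_b| / (std_a + std_b) per feature, for two classes."""
    a, b = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row, label in zip(X, y) if label == a]
        col_b = [row[j] for row, label in zip(X, y) if label == b]
        diff = abs(mean(col_a) - mean(col_b))
        denom = pstdev(col_a) + pstdev(col_b)
        if denom > 0:
            scores.append(diff / denom)
        elif diff > 0:
            # zero spread but different means: perfectly separating feature
            scores.append(float("inf"))
        else:
            scores.append(0.0)
    return scores


def top_k_indices(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data (made up for illustration): feature 0 is constant,
# feature 1 separates the classes well, feature 2 only weakly.
X = [[1, 5, 2], [1, 6, 3], [1, 0, 3], [1, 1, 4]]
y = [0, 0, 1, 1]
print(top_k_indices(chi2_scores(X, y), 2))
print(top_k_indices(snr_scores(X, y), 2))
```

Both scorers return one value per feature, so selecting features reduces to picking the indices of the largest scores and keeping only those columns.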

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/myui/incubator-hivemall JIRA-22/pr-385

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hivemall/pull/7.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7
    
----
commit 2dc176a760b553214624e98f885a719ee196cc4e
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-16T06:46:44Z

    add HiveUtils.isNumberListOI

commit 56adf2d4e8b2591c31b846b8980016d3dafdbacc
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-16T06:48:33Z

    add HiveUtils.asDoubleOI

commit 6f9b4fa0acebf604882240ccd5507d9df45bab2d
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-16T06:52:54Z

    add transpose_and_dot

commit d3009be59bcf314b373038e3db8903a041396931
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-16T07:00:58Z

    add chi2 and chi2_test

commit d8f1005bb9fbf769b117290582bed18d7607a94a
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T03:01:46Z

    mod number format

commit d0e97e6ff71b2072ec5235cc3ac169162d59da59
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T03:02:28Z

    add HiveUtils.isNumberListListOI

commit 7b07e4a6e1f700ba0a6e5b68659a040a3d89aa2f
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T03:03:44Z

    change interface of chi2

commit e9d1a94f29f31e2910a54add7c2625825d715318
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T07:55:57Z

    add array_top_k_indices

commit 1ab9b0974ca4203c00175469b7b75d5b65209547
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T07:56:15Z

    add subarray_by_indices

commit ad81b3aa5a0bbb7c248d127ba44608578c01ae00
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T08:05:55Z

    add license and format

commit be1ea37a0f5048cde4284107c04e109f0f526b42
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-20T09:00:49Z

    add ddl definitions

commit 89c81aacf5b13f6e125723cb5c703333574c10ae
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T01:56:59Z

    change to select_k_best

commit 6dc234490dc25f563b22e5659c378e6ebcf8dcdb
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T02:41:59Z

    standardize to chi2

commit a16a3fde844ba381dee7eb1e9608ddc2dcfb96fc
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T04:10:18Z

    refine chi2

commit abbf5492b95dd69e347580c59ac044a78627c547
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T04:11:00Z

    refine transpose_and_dot

commit b8cf39684496f2511e59294041d443b9438394a9
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T06:02:12Z

    fix chi2

commit a882c5f9f8067b911254dfc43d268de06a5490f9
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T07:00:36Z

    mod chi2 function name

commit 5088ef36367df1cd51ae62f1c044933676975e2e
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-21T07:22:09Z

    add tests

commit 22a608ee1c7239b2953183b5341f80c58b1e7045
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-26T08:07:55Z

    add snr

commit a1f8f958c99f3cde9e48b6d80d364004f6d98cc2
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-27T06:58:33Z

    integrate chi2 and SNR into hivemall.spark

commit aa7d5299739349b49ef4f50cc2c1969f5cb8a78f
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-27T07:02:02Z

    Merge 'master' into 'feature/feature_selection'

commit 1347de985ea6f8028c9d381f8827882ad39ad3a7
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-28T05:22:05Z

    refine feature selection in spark integration

commit 8e2842cf8c272642feaa76bf95e8fa463b0322dc
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-28T05:24:19Z

    refine tests

commit 4cfa4e5ac15a6535b187c23616c205696a1cd13b
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-28T09:26:01Z

    mod SNR for corner cases

commit 80be81ecf92cd4675dcdfaa5f456d84d484d6c44
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-28T11:01:08Z

    minor fix

commit 8d9f0d4c00758324029d342eb4b892e046ca4a49
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-29T02:02:14Z

    minor fix

commit ce4a48980e33b9f16c74a62fcea6878f28b9c08b
Author: amaya <gi...@sapphire.in.net>
Date:   2016-09-30T08:05:20Z

    change method of testing for spark

commit 67ba9631af3c231b7abd145134d17237b6aca0a5
Author: myui <yu...@gmail.com>
Date:   2016-11-21T09:19:45Z

    Merge branch 'feature/feature_selection' of
    https://github.com/amaya382/hivemall into feature_selection
    
    # Conflicts:
    #	core/src/main/java/hivemall/utils/hadoop/HiveUtils.java
    #	core/src/main/java/hivemall/utils/math/StatsUtils.java
    #	spark/spark-1.6/src/main/scala/org/apache/spark/sql/hive/GroupedDataEx.scala
    #	spark/spark-1.6/src/test/scala/org/apache/spark/sql/hive/HivemallOpsSuite.scala
    #	spark/spark-2.0/src/main/scala/org/apache/spark/sql/hive/HivemallGroupedDataset.scala
    #	spark/spark-2.0/src/test/scala/org/apache/spark/sql/hive/HivemallOpsSuite.scala

commit e44a413e5fd4270af53895fceec27ccff3d63a73
Author: myui <yu...@gmail.com>
Date:   2016-11-21T10:02:27Z

    Updated license headers

commit 6549ef5104883a9529dfd9fc52b2b24843076fbb
Author: amaya <am...@users.noreply.github.com>
Date:   2016-11-23T12:16:10Z

    Add feature selection gitbook (#386)

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---