Posted to dev@madlib.apache.org by fmcquillan99 <gi...@git.apache.org> on 2016/03/25 23:06:08 UTC

[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

GitHub user fmcquillan99 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/33

    misc doc changes for 1 dot 9

    https://issues.apache.org/jira/browse/MADLIB-982
    
    1) Front page of user docs has links to madlib.net and some broken links
    mainpage.dox.in
    
    2) doc changes to
    pca
    elastic net
    summary
    svd

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fmcquillan99/incubator-madlib misc-1-dot-9-doc-edits

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/33.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #33
    
----
commit 5c40ab4af0b7113343a303b490a84728a8c90bbc
Author: Frank McQuillan <fm...@pivotal.io>
Date:   2016-03-25T22:02:37Z

    misc doc changes for 1 dot 9

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/33#discussion_r57504745
  
    --- Diff: src/ports/postgres/modules/pca/pca.sql_in ---
    @@ -324,13 +341,20 @@ string should be double-quoted; in this case the input would be '"MyTable"').
     \ref background_pca "Technical Background"), sparse matrices almost always
     become dense during the training process. Thus, this implementation
     automatically densifies sparse matrix input, and there should be no expected
    - performance improvement in using sparse matrix input over dense matrix input.
    -
    -- If both <em>lanczos_iter</em> and proportion of variance (via the 
    -<em>grouping_cols</em>) is defined, <em>lanczos_iter</em> will 
    -take precedence in determining the number of principal components (i.e. the 
    -number of principal components will not be greater than <em>lanczos_iter</em> 
    -even if the target proportion is not reached).
    +performance improvement in using sparse matrix input over dense matrix input.
    +
    +- For the parameter 'components_param', INTEGER and FLOAT are
    +interpreted differently.  A special case to be aware of:
    +'components_param' = 1 (INTEGER) will return 1 principle
    --- End diff --
    
    again, principle -> principal



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/33#discussion_r57504858
  
    --- Diff: src/ports/postgres/modules/summary/summary.sql_in ---
    @@ -285,6 +294,16 @@ string should be double-quoted; in this case the input would be '"MyTable"').
         in a slow but exact method. The most frequent values are computed using a
         faithful implementation that preserves the approximation guarantees of
         the Cormode/Muthukrishnan method (more information in \ref grp_mfvsketch).
    +- Summary statistics are calculated for each grouping 
    --- End diff --
    
    We're repeating this text in both the argument list and the notes. That may be fine, but is it intentional?



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-madlib/pull/33



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/33#issuecomment-201615733
  
    LGTM pending some of the spelling corrections



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/33#discussion_r57503961
  
    --- Diff: src/ports/postgres/modules/elastic_net/elastic_net.sql_in ---
    @@ -130,7 +130,13 @@ the dependent variable expression to the <em>excluded</em> string.</DD>
     <DD>BOOLEAN, default: TRUE. Whether to normalize the data. Setting this to TRUE usually yields better results and faster convergence.</DD>
     
     <DT>grouping_col (optional)</DT>
    -<DD>TEXT, default: NULL. <em>Not currently implemented. Any non-NULL value is ignored.</em> An expression list used to group the input dataset into discrete groups, running one regression per group. Similar to the SQL <tt>GROUP BY</tt> clause. When this value is NULL, no grouping is used and a single result model is generated.</DD>
    +<DD>TEXT, default: NULL. 
    +
    +@note <em>Not currently implemented. Any non-NULL value is ignored.
    +Grouping support will be added in a future release. </em>
    +An expression list used to group the input dataset into discrete groups, 
    --- End diff --
    
    Maybe we shouldn't say anything about the expression list, since grouping isn't implemented yet. 



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/33#discussion_r57504661
  
    --- Diff: src/ports/postgres/modules/pca/pca.sql_in ---
    @@ -177,19 +177,29 @@ contain values between 1 and <em>M</em>.</DD>
     
     <DT>components_param</DT>
     <DD>INTEGER or FLOAT.  The parameter to control the number of principal 
    -components to calculate from the input data. If components_param is INTEGER, 
    -it is used for denoting the number of principal components (<em>k</em>) to 
    -compute. If components_param is FLOAT, the algorithm would return enough 
    +components to calculate from the input data. If 'components_param' is INTEGER, 
    +it is used to denote the number of principal components (<em>k</em>) to 
    +compute. If 'components_param' is FLOAT, the algorithm will return enough 
     principal vectors so that the ratio of the sum of the eigenvalues collected 
    -thus far to the sum of all eigenvalues is greater than this parameter.
    -This value has to be either a positive INTEGER or a FLOAT in the range 
    -(0.0,1.0]</DD>
    +thus far to the sum of all eigenvalues is greater than this parameter 
    +(proportion of variance).  The value of 'components_param' must be either 
    +a positive INTEGER or a FLOAT in the range (0.0,1.0]</DD>
    +
    +@note The difference in interpretation between INTEGER and FLOAT was 
    --- End diff --
    
    Replace everywhere: principle -> principal
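
The two interpretations of 'components_param' described in the diff above can be illustrated with a minimal Python sketch. This is not MADlib's implementation: the helper name `num_components`, the sample eigenvalues, and the choice of `>=` for the proportion test (so that a FLOAT value of 1.0 is reachable) are assumptions made only for illustration.

```python
from typing import Sequence, Union


def num_components(eigenvalues: Sequence[float],
                   components_param: Union[int, float]) -> int:
    """Hypothetical helper: decide how many principal components to keep.

    INTEGER -> treated as the number of components k.
    FLOAT   -> treated as a target proportion of variance in (0.0, 1.0].
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(components_param, bool):
        raise TypeError("components_param must be an int or a float")

    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need at least one positive eigenvalue")

    if isinstance(components_param, int):
        if components_param < 1:
            raise ValueError("INTEGER components_param must be positive")
        # Never return more components than there are eigenvalues.
        return min(components_param, len(ordered))

    if not 0.0 < components_param <= 1.0:
        raise ValueError("FLOAT components_param must be in (0.0, 1.0]")

    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        # Assumption: '>=' rather than '>' so that 1.0 means "all variance".
        if running / total >= components_param:
            return k
    return len(ordered)


eigenvalues = [4.0, 3.0, 2.0, 1.0]  # made-up values for illustration
print(num_components(eigenvalues, 1))    # INTEGER 1: one component
print(num_components(eigenvalues, 1.0))  # FLOAT 1.0: all four components
print(num_components(eigenvalues, 0.8))  # FLOAT 0.8: three components (90% >= 80%)
```

Under these assumptions, the INTEGER value 1 and the FLOAT value 1.0 give very different results, which is the special case the added note in the pull request warns about.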



[GitHub] incubator-madlib pull request: misc doc changes for 1 dot 9

Posted by fmcquillan99 <gi...@git.apache.org>.
Github user fmcquillan99 commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/33#issuecomment-201618779
  
    thanks for review. 
    
    I made the suggested changes

