Posted to dev@mahout.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/03/07 23:39:40 UTC

[jira] [Commented] (MAHOUT-1800) Pare down Classtag overuse

    [ https://issues.apache.org/jira/browse/MAHOUT-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183897#comment-15183897 ] 

ASF GitHub Bot commented on MAHOUT-1800:
----------------------------------------

GitHub user andrewpalumbo opened a pull request:

    https://github.com/apache/mahout/pull/183

    MAHOUT-1800: Pare down Classtag overuse

    Currently, almost every operator requires an implicit parameter for the classtag context bound of the DRM rowset key type, even for simple expressions such as `drmA + drmB`.
    
    In reality, though, the DAG can already infer the key classtag, much as it infers product geometry, because classtags are already embedded in the logical plan.
    
    For example, `classtag(drmA+drmB) == classtag(drmA) == classtag(drmB)`.
    
    Not only does the DAG already contain this information, but the redundant context bound also opens the door to a loss of inference, since the optimizer doesn't verify that the new context bound is actually valid by retracing the inference. So any operation may introduce an invalid row key type and, as a consequence, invalid optimization information, without any further checks.
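
To make the hazard concrete, here is a minimal sketch assuming Scala 2.x. `ToyDrm` and `plusWithBound` are hypothetical stand-ins invented for illustration, not Mahout's actual classes. The operator takes a fresh `ClassTag[K]` through a context bound and never checks it against the tags its operands already carry, so a mismatched tag goes through unnoticed.

```scala
import scala.reflect.{ClassTag, classTag}

object ContextBoundHazard {
  // Hypothetical stand-in for a DRM node that records its row key classtag.
  final case class ToyDrm[K](keyTag: ClassTag[K])

  // Old style: the operator asks for a fresh ClassTag[K] and trusts it,
  // ignoring the tags already stored in its operands.
  def plusWithBound[K: ClassTag](a: ToyDrm[K], b: ToyDrm[K]): ToyDrm[K] =
    ToyDrm(implicitly[ClassTag[K]])

  def main(args: Array[String]): Unit = {
    val a = ToyDrm[Int](classTag[Int])
    val b = ToyDrm[Int](classTag[Int])

    // Normal call: the compiler supplies the correct tag.
    val ok = plusWithBound(a, b)
    assert(ok.keyTag == a.keyTag)

    // ClassTag.apply accepts any runtime class for any type parameter,
    // so a tag typed as ClassTag[Int] can actually describe String.
    val wrongTag: ClassTag[Int] = ClassTag[Int](classOf[String])
    val bad = plusWithBound(a, b)(wrongTag)

    // Nothing retraces the inference, so the result carries the wrong key type.
    assert(bad.keyTag != a.keyTag)
    assert(bad.keyTag.runtimeClass == classOf[String])
  }
}
```

The explicit `wrongTag` is contrived, but generic helper code where `K` is erased can end up in the same place whenever a tag is passed along by hand.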
    
    This patch does the following:
    (1) eliminates the ClassTag[K] context bound in the majority of operations
    (2) adds a keyClassTag: ClassTag[K] property getter to the DrmLike[K] trait itself
    (3) ensures lazy inference of the returned key parameter classtag via DAG inference.
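
The three items above can be sketched roughly as follows, assuming Scala 2.x. Only the name `keyClassTag` comes from the description; `ToyDrm`, `ToyLeaf` and `ToyPlus` are hypothetical toy types and do not reproduce the real `DrmLike[K]` trait or Mahout's logical operators.

```scala
import scala.reflect.{ClassTag, classTag}

object KeyClassTagSketch {
  // Toy stand-in for DrmLike[K].
  trait ToyDrm[K] {
    // Item (2): the key classtag is a property of the node itself.
    def keyClassTag: ClassTag[K]

    // Item (1): the operator needs no [K: ClassTag] context bound.
    def +(that: ToyDrm[K]): ToyDrm[K] = ToyPlus(this, that)
  }

  // Leaf node: the only place where a classtag is supplied from outside.
  final case class ToyLeaf[K](name: String)(implicit ct: ClassTag[K])
      extends ToyDrm[K] {
    val keyClassTag: ClassTag[K] = ct
  }

  // Item (3): the sum derives its key classtag lazily from its operand,
  // so the tag always follows the plan instead of the call site.
  final case class ToyPlus[K](a: ToyDrm[K], b: ToyDrm[K]) extends ToyDrm[K] {
    lazy val keyClassTag: ClassTag[K] = a.keyClassTag
  }

  def main(args: Array[String]): Unit = {
    val drmA = ToyLeaf[Int]("A")
    val drmB = ToyLeaf[Int]("B")
    val sum = drmA + drmB

    // classtag(drmA + drmB) == classtag(drmA) == classtag(drmB)
    assert(sum.keyClassTag == drmA.keyClassTag)
    assert(drmA.keyClassTag == drmB.keyClassTag)
    assert(sum.keyClassTag == classTag[Int])
  }
}
```

Because the tag is read from the operand rather than requested at each call, there is no second source of truth that could disagree with the plan.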

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewpalumbo/mahout MAHOUT-1800

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #183
    
----
commit e4a358d8adeb8878bc67e7bdf11e9c59f6003365
Author: Andrew Palumbo <ap...@apache.org>
Date:   2016-03-07T22:32:14Z

    Pare down Classtag overuse

----


> Pare down Classtag overuse
> --------------------------
>
>                 Key: MAHOUT-1800
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1800
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.11.1
>            Reporter: Andrew Palumbo
>             Fix For: 0.11.2
>
>
> Currently, almost every operator requires an implicit parameter for the classtag context bound of the DRM rowset key type, even for simple expressions such as drmA + drmB.
> In reality, though, the DAG can already infer the key classtag, much as it infers product geometry, because classtags are already embedded in the logical plan.
> For example, {{classtag(drmA+drmB) == classtag(drmA) == classtag(drmB)}}.
> Not only does the DAG already contain this information, but the redundant context bound also opens the door to a loss of inference, since the optimizer doesn't verify that the new context bound is actually valid by retracing the inference. So any operation may introduce an invalid row key type and, as a consequence, invalid optimization information, without any further checks.


