Posted to dev@drill.apache.org by mehant <gi...@git.apache.org> on 2014/04/23 01:51:12 UTC

[GitHub] incubator-drill pull request: DRILL-556: Implement aggregate funct...

GitHub user mehant opened a pull request:

    https://github.com/apache/incubator-drill/pull/56

    DRILL-556: Implement aggregate functions.

    The following aggregate functions are added:
    stddev()
    stddev_samp()
    stddev_pop()
    variance()
    var_samp()
    var_pop()

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mehant/incubator-drill aggregate_functions

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-drill/pull/56.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #56
    
----
commit e4665ec40a047253077540d07d2610b98b41f4fc
Author: Mehant Baid <me...@gmail.com>
Date:   2014-04-22T23:42:06Z

    DRILL-556: Implement the following aggregate functions.
    stddev()
    stddev_samp()
    stddev_pop()
    variance()
    var_samp()
    var_pop()

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-drill pull request: DRILL-556: Implement aggregate funct...

Posted by amansinha100 <gi...@git.apache.org>.
GitHub user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/incubator-drill/pull/56#discussion_r12080141
  
    --- Diff: exec/java-exec/src/main/codegen/templates/AggrTypeFunctions3.java ---
    @@ -0,0 +1,128 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +<@pp.dropOutputFile />
    +
    +
    +
    +<#list aggrtypes2.aggrtypes as aggrtype>
    +<#if aggrtype.className != "Avg">
    --- End diff --
    
    Instead of checking for exclusion of Avg, it would be better to have a separate AggrTypes3.tdd consisting of the new functions. The general idea was that AggrTypes1 contains aggr functions that have 1 running workspace variable, and AggrTypes2 contains aggr functions that have 2 running workspace variables. Since stddev and variance have 3 running workspace variables, why not put them in their own tdd file?
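
To make the grouping by number of running workspace variables concrete, here is a minimal Java sketch. It is not Drill code: the class names `WorkspaceSketch`, `Sum`, `Avg` and `VarPop` are hypothetical, and plain fields stand in for the `@Workspace` holders of the template. The three-variable case mirrors the `avg`, `dev` and `count` holders quoted in the diff.

```java
// Hypothetical illustration only; none of these classes exist in Drill.
final class WorkspaceSketch {

    // One running variable: the sum itself.
    static final class Sum {
        double sum;
        void add(double x) { sum += x; }
        double output() { return sum; }
    }

    // Two running variables: sum and count.
    static final class Avg {
        double sum;
        long count;
        void add(double x) { sum += x; count++; }
        double output() { return sum / count; }
    }

    // Three running variables: running mean, running sum of squared
    // deviations, and count (compare avg, dev and count in the template).
    static final class VarPop {
        double avg;
        double dev;
        long count;
        void add(double x) {
            count++;
            double delta = x - avg;
            avg += delta / count;
            dev += delta * (x - avg);
        }
        double output() { return dev / count; }
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        Sum s = new Sum();
        Avg a = new Avg();
        VarPop v = new VarPop();
        for (double x : data) { s.add(x); a.add(x); v.add(x); }
        System.out.println(s.output() + " " + a.output() + " " + v.output());
    }
}
```

The sketch does not guard against empty input; it only shows why the third group needs state that the first two do not.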



[GitHub] incubator-drill pull request: DRILL-556: Implement aggregate funct...

Posted by amansinha100 <gi...@git.apache.org>.
GitHub user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/incubator-drill/pull/56#discussion_r12080233
  
    --- Diff: exec/java-exec/src/main/codegen/templates/AggrTypeFunctions3.java ---
    @@ -0,0 +1,128 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +<@pp.dropOutputFile />
    +
    +
    +
    +<#list aggrtypes2.aggrtypes as aggrtype>
    +<#if aggrtype.className != "Avg">
    +<@pp.changeOutputFile name="/org/apache/drill/exec/expr/fn/impl/gaggr/${aggrtype.className}Functions.java" />
    +
    +<#include "/@includes/license.ftl" />
    +
    +<#-- A utility class that is used to generate java code for aggr functions such as stddev, variance -->
    +
    +/*
    + * This class is automatically generated from AggrTypeFunctions2.tdd using FreeMarker.
    + */
    +
    +package org.apache.drill.exec.expr.fn.impl.gaggr;
    +
    +import org.apache.drill.exec.expr.DrillAggFunc;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
    +import org.apache.drill.exec.expr.annotations.Output;
    +import org.apache.drill.exec.expr.annotations.Param;
    +import org.apache.drill.exec.expr.annotations.Workspace;
    +import org.apache.drill.exec.expr.holders.*;
    +import org.apache.drill.exec.record.RecordBatch;
    +
    +@SuppressWarnings("unused")
    +
    +public class ${aggrtype.className}Functions {
    +	static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${aggrtype.className}Functions.class);
    +
    +<#list aggrtype.types as type>
    +
    +<#if aggrtype.aliasName == "">
    +@FunctionTemplate(name = "${aggrtype.funcName}", scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
    +<#else>
    +@FunctionTemplate(names = {"${aggrtype.funcName}", "${aggrtype.aliasName}"}, scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
    +</#if>
    +
    +public static class ${type.inputType}${aggrtype.className} implements DrillAggFunc{
    +
    +  @Param ${type.inputType}Holder in;
    +  @Workspace ${type.movingAverageType}Holder avg;
    +  @Workspace ${type.movingDeviationType}Holder dev;
    +  @Workspace ${type.countRunningType}Holder count;
    +  @Output ${type.outputType}Holder out;
    +
    +  public void setup(RecordBatch b) {
    +  	avg = new ${type.movingAverageType}Holder();
    +    dev = new ${type.movingDeviationType}Holder();
    +    count = new ${type.countRunningType}Holder();
    +
    +    // Initialize the workspace variables
    +    avg.value = 0;
    +    dev.value = 0;
    +    count.value = 1;
    +  }
    +
    +  @Override
    +  public void add() {
    +	<#if type.inputType?starts_with("Nullable")>
    +	  sout: {
    +	  if (in.isSet == 0) {
    +	   // processing nullable input and the value is null, so don't do anything...
    +	   break sout;
    +	  }
    +	</#if>
    +
    +    // Welford's approach to compute standard deviation
    --- End diff --
    
    Welford's method does the computation online (streaming) and looks simple, so I am wondering whether there is a catch. It updates the running average each time a row is processed, as opposed to computing it once at the end, so we would have to see how it performs.
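
The trade-off raised here can be sketched in plain Java. This is an illustration under assumptions, not the code in the pull request: `WelfordSketch` and `naiveVarSamp` are hypothetical names. The online update pays one division per row, while the alternative of accumulating a sum and a sum of squares divides only once at the end but can lose precision when the values are large relative to their spread.

```java
// Illustrative sketch only; class and method names are hypothetical, not Drill's.
final class WelfordSketch {
    private long count;   // rows seen so far
    private double mean;  // running average
    private double m2;    // running sum of squared deviations from the mean

    // Online update: the mean is refreshed for every row (one division per row).
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    // Finalizers; no guard for empty or single-row input in this sketch.
    double varPop()     { return m2 / count; }
    double varSamp()    { return m2 / (count - 1); }
    double stddevPop()  { return Math.sqrt(varPop()); }
    double stddevSamp() { return Math.sqrt(varSamp()); }

    // Alternative: accumulate sum and sum of squares, divide once at the end.
    static double naiveVarSamp(double[] xs) {
        double sum = 0, sumSq = 0;
        for (double x : xs) { sum += x; sumSq += x * x; }
        int n = xs.length;
        return (sumSq - sum * sum / n) / (n - 1);
    }

    public static void main(String[] args) {
        // Large offset, small spread: the exact sample variance is 30.
        double[] xs = {1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16};
        WelfordSketch w = new WelfordSketch();
        for (double x : xs) w.add(x);
        System.out.println("welford var_samp = " + w.varSamp());
        System.out.println("naive   var_samp = " + naiveVarSamp(xs));
    }
}
```

For this input the online update recovers the sample variance of 30, whereas the sum-of-squares form subtracts two numbers near 4e18 and cannot be expected to keep the low-order digits. Whether the extra per-row division matters in practice would still need to be measured.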



[GitHub] incubator-drill pull request: DRILL-556: Implement aggregate funct...

Posted by mehant <gi...@git.apache.org>.
GitHub user mehant closed the pull request at:

    https://github.com/apache/incubator-drill/pull/56



[GitHub] incubator-drill pull request: DRILL-556: Implement aggregate funct...

Posted by mehant <gi...@git.apache.org>.
GitHub user mehant commented on the pull request:

    https://github.com/apache/incubator-drill/pull/56#issuecomment-42269971
  
    merged as eedb4d7c47c0cc021f8c434e6910a8574104531e

