Posted to issues@drill.apache.org by "benj (Jira)" <ji...@apache.org> on 2019/11/29 15:18:00 UTC

[jira] [Commented] (DRILL-6963) create/aggregate/work with array

    [ https://issues.apache.org/jira/browse/DRILL-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985067#comment-16985067 ] 

benj commented on DRILL-6963:
-----------------------------

For the second point (array_agg), pending an official function, here is a simple implementation that does this (without support for _DISTINCT_ or _ORDER BY_):
{code:java}
package org.apache.drill.contrib.function;

import io.netty.buffer.DrillBuf;
import org.apache.drill.exec.expr.DrillAggFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.expr.annotations.Workspace;
import org.apache.drill.exec.expr.holders.*;

import javax.inject.Inject;

// If the dataset is too large, you need: ALTER SESSION SET `planner.enable_hashagg` = false
public class ArrayAgg {

// STRING NULLABLE //	
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL)
public static class NullableVarChar_ArrayAgg implements DrillAggFunc {
  @Param NullableVarCharHolder input;
  @Workspace ObjectHolder agg;
  @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;
  @Inject DrillBuf buffer;

  @Override public void setup() {	  
    agg = new ObjectHolder();
  }

  @Override public void reset() {
    agg = new ObjectHolder();
  }

  @Override public void add() {
    org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter;
    // Create the list writer lazily, the first time add() is called.
    if (agg.obj == null) {
      agg.obj = out.rootAsList();
    }

    // Skip NULL inputs: they are not added to the array.
    if (input.isSet == 0) {
      return;
    }

    // Copy the input string into the injected buffer and wrap it in a non-nullable holder.
    org.apache.drill.exec.expr.holders.VarCharHolder rowHolder = new org.apache.drill.exec.expr.holders.VarCharHolder();
    byte[] inputBytes = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input).getBytes(com.google.common.base.Charsets.UTF_8);
    // reallocIfNeeded may return a different buffer, so keep the returned reference.
    buffer = buffer.reallocIfNeeded(inputBytes.length);
    buffer.setBytes(0, inputBytes);
    rowHolder.start = 0;
    rowHolder.end = inputBytes.length;
    rowHolder.buffer = buffer;

    listWriter = (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
    listWriter.varChar().write(rowHolder);
  }

  @Override public void output() {
    ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
  }
}

// INTEGER NULLABLE //
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL)
public static class NullableInt_ArrayAgg implements DrillAggFunc {
  @Param NullableIntHolder input;
  @Workspace ObjectHolder agg;
  @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;
  @Inject DrillBuf buffer;

  @Override public void setup() {
    agg = new ObjectHolder();
  }

  @Override public void reset() {
    agg = new ObjectHolder();
  }

  @Override public void add() {
    org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter;
    if (agg.obj == null) {
      agg.obj = out.rootAsList();
    }

    if (input.isSet == 0) {
      return;
    }

    listWriter = (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
    listWriter.integer().writeInt(input.value);
  }

  @Override public void output() {
    ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
  }
}

// ...
}
{code}
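
To make the intended behaviour easier to check, here is a rough plain-Java model of what the functions above are meant to compute. It does not use any Drill API: the class ArrayAggSketch and the helper arrayAgg are hypothetical names made up for this illustration. It assumes that NULL inputs are skipped, that duplicates are kept (no DISTINCT), and that values appear in the order they are read (no ORDER BY; in Drill itself the read order is not guaranteed).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArrayAggSketch {

  // Hypothetical helper, not part of Drill: groups rows by key and collects
  // the non-null values of each group into a list, in encounter order.
  static Map<String, List<Integer>> arrayAgg(List<String> keys, List<Integer> values) {
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    for (int i = 0; i < keys.size(); i++) {
      // The list is created before the null check, as in add() above,
      // so a group containing only NULLs still yields an empty list.
      List<Integer> list = groups.computeIfAbsent(keys.get(i), k -> new ArrayList<>());
      Integer value = values.get(i);
      if (value != null) {
        list.add(value); // duplicates are kept: there is no DISTINCT
      }
    }
    return groups;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("a", "b", "a", "b", "c");
    List<Integer> values = Arrays.asList(1, 2, null, 2, null);
    // Prints {a=[1], b=[2, 2], c=[]}
    System.out.println(arrayAgg(keys, values));
  }
}
```

This corresponds roughly to SELECT col1, array_agg(col2) FROM ... GROUP BY col1, with col1 as the key and col2 as the value.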
 

> create/aggregate/work with array
> --------------------------------
>
>                 Key: DRILL-6963
>                 URL: https://issues.apache.org/jira/browse/DRILL-6963
>             Project: Apache Drill
>          Issue Type: Wish
>          Components: Functions - Drill
>            Reporter: benj
>            Priority: Major
>
> * Add the ability to build an array (like: SELECT array[a1,a2,a3...]), ideally working with all types
> * Add a default array_agg (like: SELECT col1, array_agg(col2), array_agg(DISTINCT col2) FROM ... GROUP BY col1;), ideally working with all types
> * Add functions/facilities/operators to work with arrays


