Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/06/14 02:00:17 UTC

Apache Pinot Daily Email Digest (2021-06-13)

### _#general_

  
 **@agnihotrisuryansh55:** @agnihotrisuryansh55 has joined the channel  
 **@ashish:** Is this issue resolved - ?  
**@mayanks:** Yes the enhancement was merged  
**@ashish:** Ok, thanks. Is there an example of a multi-argument aggregation
function that I can refer to?  
**@mayanks:**  
**@ashish:** That however takes only a single column as input.  
**@ashish:** For an aggregation function that takes multiple columns as input,
the `Map<ExpressionContext, BlockValSet> blockValSetMap` parameter in the
aggregation method may not be sufficient.  
**@ashish:** As each value for column1 could be associated with multiple
values of column2.  
**@ashish:** Any idea on how to handle this?  
**@mayanks:** What's your requirement?  
**@ashish:** I was looking to do the following query: `select bucket(ts, 1
min), avg_then_sum(ts, dimensioncolumn, value) group by bucket(ts, 1 min)`  
**@mayanks:** Theta sketch is just an example. Custom aggr functions can now
take any number of args  
**@ashish:** Basically, within each 1 min bucket, I want to average all metric
values grouped by dimension and then sum those averaged values across the
dimension column.  
**@mayanks:** You planning to write a custom aggr function?  
**@ashish:** Yes, unless there is a better alternative  
**@ashish:** I was told that subqueries are not yet supported.  
**@mayanks:** In `Map<ExpressionContext, BlockValSet> blockValSetMap`, each
entry in the map can be a column  
**@mayanks:** or expression on the column  
**@ashish:** Yes. However, each `BlockValSet`'s `getIntValuesSV` and other
methods return an array of values.  
**@ashish:** So for column1 and column2 (assuming they are both single-value
int columns), will `getIntValuesSV` return the same number of elements? And
will they be in the same order (by docId)?  
**@ashish:** If that’s the case it should work for me.  
**@mayanks:** Yes same order  
**@ashish:** Got it - so individual array elements of both columns will
correspond to the same docId. Right?  
**@mayanks:** Yep  
**@ashish:** Thanks for confirming.  
**@ashish:** I plan to proceed along those lines - let me know if you have any
alternate ways / suggestions, etc.  
**@ashish:** appreciate your help.  
**@mayanks:** :+1:  
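
For reference, the computation described in this thread can be sketched in plain Java. This is only an illustration and not Pinot code: it assumes the timestamp, dimension, and metric values arrive as three parallel arrays in the same docId order (as confirmed above), and the names `AvgThenSumSketch` and `avgThenSum` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class AvgThenSumSketch {
    /**
     * Returns bucketStartMillis -> sum over dimension values of avg(metric).
     * Assumption: element i of every array belongs to the same document.
     */
    public static Map<Long, Double> avgThenSum(
            long[] tsMillis, String[] dims, double[] values, long bucketMillis) {
        // bucket start -> dimension value -> {running sum, running count}
        Map<Long, Map<String, double[]>> acc = new TreeMap<>();
        for (int i = 0; i < tsMillis.length; i++) {
            long bucket = Math.floorDiv(tsMillis[i], bucketMillis) * bucketMillis;
            double[] sumCount = acc
                    .computeIfAbsent(bucket, b -> new HashMap<>())
                    .computeIfAbsent(dims[i], d -> new double[2]);
            sumCount[0] += values[i];
            sumCount[1] += 1;
        }
        // Final step: average per dimension, then sum the averages per bucket.
        Map<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, Map<String, double[]>> e : acc.entrySet()) {
            double total = 0.0;
            for (double[] sumCount : e.getValue().values()) {
                total += sumCount[0] / sumCount[1];
            }
            result.put(e.getKey(), total);
        }
        return result;
    }
}
```

The sketch keeps (sum, count) pairs per dimension value rather than averages, so partial results computed over different sets of rows could be merged before the final step.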

###  _#random_

  
 **@agnihotrisuryansh55:** @agnihotrisuryansh55 has joined the channel  

###  _#troubleshooting_

  
 **@agnihotrisuryansh55:** @agnihotrisuryansh55 has joined the channel  
 **@jmeyer:** Hello, is there a way to use lookups (dimTable) inside a WHERE
clause? Something like `SELECT SUM(value) FROM table WHERE user IN
LOOKUP('group', 'user', 'groupId', '<groupId>')` (which isn't valid).
Basically, my goal is to fetch a list of 'users' from the dimTable using a
'group' identifier and then filter on those users in the main table.  
**@jackie.jxt:** This is not a look-up query. You can take a look at `Use
IdSet for Id Filtering` design: . This query can be modeled as a subquery and
solved with the `IN_SUBQUERY` function  
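
A rough sketch of what such a query might look like, built as a string in Java. It assumes the `IN_SUBQUERY(column, '<subquery using ID_SET>') = 1` form, and every table and column name passed in is a placeholder. The class `IdSetQuerySketch` is hypothetical and identifiers are not escaped, so treat this as an illustration rather than a tested recipe.

```java
public class IdSetQuerySketch {
    // Wraps text in single quotes, doubling any embedded single quotes
    // so it can sit inside a SQL string literal.
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    // Builds a query that filters the fact table on the ids returned by a
    // subquery against the dimension table. All names are caller-supplied.
    public static String build(String factTable, String factCol,
                               String dimTable, String dimCol,
                               String keyCol, String keyValue) {
        String inner = "SELECT ID_SET(" + dimCol + ") FROM " + dimTable
                + " WHERE " + keyCol + " = " + quote(keyValue);
        return "SELECT SUM(value) FROM " + factTable
                + " WHERE IN_SUBQUERY(" + factCol + ", " + quote(inner) + ") = 1";
    }
}
```

Because the inner query is itself passed as a string literal, its own quotes have to be doubled, which is what `quote` takes care of.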
 **@syedakram93:** Is there any development in progress in Pinot to support
sorting and partitioning offline data (ORC) present in HDFS?  
 **@syedakram93:** If yes, how long will it take to release this feature?  

###  _#pinot-dev_

  
 **@agnihotrisuryansh55:** @agnihotrisuryansh55 has joined the channel  
 **@syedakram93:** Is there any development in progress in Pinot to support
sorting and partitioning offline data (ORC) present in HDFS?  
 **@syedakram93:** If yes, how long will it take to release this feature?  