Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/03/22 02:00:22 UTC

Apache Pinot Daily Email Digest (2021-03-21)

### _#general_

  
 **@kis:** @kis has joined the channel  
 **@sosyalmedya.oguzhan:** First question: when we enable `aggregateMetrics`
to pre-aggregate data as it is consumed, Pinot aggregates based on the fields
defined in `dimensionFieldSpecs` and `dateTimeFieldSpecs`. Can Pinot aggregate
based only on the fields defined in `dimensionFieldSpecs` when pre-aggregating
with `aggregateMetrics`? Second question: we can set how long a real-time
table consumes before it generates a segment with the
`realtime.segment.flush.threshold.time` config. Let's assume the current time
is 10:25. When I set `realtime.segment.flush.threshold.time` to `1 hour`,
Pinot creates a segment with start time 10:25 and closes it at 11:25, so the
segment's start/end time is 10:25-11:25. But I want Pinot to close the segment
when a new hour starts, so that its start/end time is 10:00-11:00. How can I
achieve that?  
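To make the second question concrete, here is a minimal Python sketch of the arithmetic only. It is not Pinot code: `flush_window` and `aligned_bucket` are hypothetical helper names, and the sketch assumes the flush threshold simply counts elapsed time from when consumption starts, as described in the question.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def flush_window(consume_start: datetime, threshold: timedelta):
    """Window you get when the segment closes after a fixed elapsed time."""
    return consume_start, consume_start + threshold

def aligned_bucket(ts: datetime, bucket: timedelta):
    """Wall-clock-aligned bucket that contains ts (what the question asks for)."""
    offset = (ts - EPOCH) % bucket  # how far ts is into its bucket
    start = ts - offset
    return start, start + bucket

start = datetime(2021, 3, 21, 10, 25)
one_hour = timedelta(hours=1)

# Elapsed-time flush: the window starts wherever consumption started.
print(flush_window(start, one_hour))
# Hour-aligned bucket: the window snaps to the top of the hour.
print(aligned_bucket(start, one_hour))
```

The first call yields the 10:25-11:25 window from the question, the second the 10:00-11:00 window being asked for.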
**@g.kishore:** 1\. Aggregating only on `dimensionFieldSpecs` can give wrong
results, right? If you want aggregation on specific dimensions, use a
star-tree index. 2\. Use the realtime-to-offline converter minion task. It
periodically merges multiple segments and creates time-partitioned segments.  
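As a rough illustration of the second suggestion, the sketch below builds the task section of a real-time table config as a Python dict and prints it as JSON. The key names (`taskTypeConfigsMap`, `RealtimeToOfflineSegmentsTask`, `bucketTimePeriod`, `bufferTimePeriod`) are assumed from the Pinot minion task documentation and the values are placeholders, so verify both against your Pinot version.

```python
import json

# Assumption: key names follow the RealtimeToOfflineSegmentsTask docs.
# The "1h" values are illustrative, chosen to match the hourly buckets
# asked about in the thread.
task_config = {
    "task": {
        "taskTypeConfigsMap": {
            "RealtimeToOfflineSegmentsTask": {
                "bucketTimePeriod": "1h",  # size of each time-partitioned window
                "bufferTimePeriod": "1h",  # how long to wait before converting a window
            }
        }
    }
}

# Fragment to merge into the real-time table config.
print(json.dumps(task_config, indent=2))
```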
**@sosyalmedya.oguzhan:** When I use the star-tree index, Pinot only looks at
the star-tree documents, right? It still manages the time boundary using the
time column. So, if I have columns like this:
```
dimensions: sellerId, brand, category, productId, productName
metrics: totalCount
timeFields: eventDate, orderDate
```
I want to get aggregated results based on orderDate, brand, category,
productId for a specific seller, category or brand. Some possible queries:
```
select productId, sum(totalCount) from t where sellerId = x and orderDate > Y and orderDate < Z and category = ' group by productId
select category, sum(totalCount) from t where sellerId = x and orderDate > Y and orderDate < Z group by category
select brand, sum(totalCount) from t where sellerId = x and orderDate > Y and orderDate < Z group by brand
```
I have to create a star-tree index on sellerId, brand, category, productId and
orderDate, right? In that case, what happens when I want to get productName?  
**@sosyalmedya.oguzhan:** And also, we can only create one index on a column,
right?  
**@g.kishore:** That’s right. You can create an index on a specific column.  
 **@allison.t.murphy22:** @allison.t.murphy22 has joined the channel  
 **@nhat:** @nhat has joined the channel  
 **@rahul.kabra.corp:** @rahul.kabra.corp has joined the channel  

###  _#random_

  
 **@kis:** @kis has joined the channel  
 **@allison.t.murphy22:** @allison.t.murphy22 has joined the channel  
 **@nhat:** @nhat has joined the channel  
 **@rahul.kabra.corp:** @rahul.kabra.corp has joined the channel  

###  _#feat-better-schema-evolution_

  
 **@nhat:** @nhat has joined the channel  

###  _#troubleshooting_

  
 **@kis:** @kis has joined the channel  
 **@allison.t.murphy22:** @allison.t.murphy22 has joined the channel  
 **@nhat:** @nhat has joined the channel  
 **@rahul.kabra.corp:** @rahul.kabra.corp has joined the channel  

###  _#pinot-dev_

  
 **@nhat:** @nhat has joined the channel  

###  _#pinot-startup_

  
 **@ravi.maddi:** @ravi.maddi has joined the channel  
 **@vallamsetty:** @vallamsetty has joined the channel  
 **@mayanks:** @mayanks has joined the channel  
 **@vallamsetty:** Hey Ravi.. It's good talking to you...  
 **@vallamsetty:** Let's sync up at a US-friendly time with Mayank, and we can
help with your env.  