You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/05/12 02:00:19 UTC

Apache Pinot Daily Email Digest (2021-05-11)

### _#general_

  
 **@humengyuk18:** What kind of index should I use if I have datetime string
column to enable faster ranged query? Will a ranged index help?  
**@fx19880617:** range index only works for numbers.  
**@fx19880617:** for date string, you can try to have a separated column to
convert the datestring to seconds or days value  
**@fx19880617:** also if you have the data partitioned by date for segments,
then range query should mostly fall into to the individual segment  
**@humengyuk18:** I see. But the problem for our use case is, superset time
column can only be datetime, that’s why we converted epoch_ms to datetime
during ingestion.  
**@fx19880617:** oh?  
**@fx19880617:** I thought superset can have epoch_ms and epoch_s?  
**@fx19880617:** something like:  
**@humengyuk18:** For temporal column, it has to datetime or datetime like.  
**@humengyuk18:** I guess a long type can’t work as temporal?  
**@fx19880617:** you can check mark `is temporal`  
**@fx19880617:**  
**@fx19880617:** this is one example  
**@fx19880617:** this is the query superset generated:  
**@humengyuk18:** It works, the description on superset is a little
misleading, I thought it must be datetime type.  
**@humengyuk18:** Why using epoch_ms time column is much slower than a
datetime string column?  
**@humengyuk18:** The same superset cost 3675 ms using epoch_ms column, but
only cost 575 ms using a datetime string column.  
**@fx19880617:** hmm, do you have the query generated for the graph?  
**@fx19880617:** it’s possible that the group by transformation make the
computation complicated  
**@humengyuk18:** The only difference for the query is the time filter.  
**@humengyuk18:** ```SELECT count(DISTINCT device_sn) AS count_1 FROM
pad_interaction WHERE "time" >= 1620705960000 AND "time" < 1620706560000 AND
company_name = 'xxx' AND action_value = 'mqtt_ping' AND action_message =
'success' AND ((regexp_like(device_sn, '^WAL.*'))) ORDER BY count(DISTINCT
device_sn) DESC;```  
**@fx19880617:** oh? can you paste the queries?  
**@humengyuk18:** ```SELECT count(DISTINCT device_sn) AS count_1 FROM
pad_interaction WHERE ds >= '2021-05-11 12:03:00' AND ds < '2021-05-11
12:13:00' AND company_name = 'xxx' AND action_value = 'mqtt_ping' AND
action_message = 'success' AND ((regexp_like(device_sn, '^WAL.*'))) ORDER BY
count(DISTINCT device_sn) DESC```  
**@fx19880617:** hmm  
**@fx19880617:** this query seems only hit the most recent data  
**@fx19880617:** it’s weird that there are some big difference  
**@humengyuk18:** numEntriesScannedInFilter difference is huge, datetime
string has only 9370339 entries scanned, epoch_ms has 262577613 entries
scanned.  
**@fx19880617:** oh, because it’s ms not seconds  
**@fx19880617:** then add a second epoch value column should help  
**@fx19880617:** also,  
**@fx19880617:** do you have any sort index?  
**@humengyuk18:** No, no index is configured.  
**@fx19880617:** ic, then adding inverted index may help as well for columns
in filtering  
**@humengyuk18:** Thanks, I will add some inverted index.  
**@g.kishore:** Why not range?  
**@humengyuk18:** Can range index used for string columns? I didn’t see any
description in the documentation.  
**@fx19880617:** I feel sortIndex on time and inverted index on company_name,
action_value, action_message may help  
**@humengyuk18:** inverted index can’t apply on multivalue columns?  
**@fx19880617:** inv index works for multi-value column  
 **@david.gyoker:** @david.gyoker has joined the channel  
 **@pedro.cls93:** Hello, Does Pinot support defining a computed field
(metric) based on a field that does not appear in the schema but exists in the
ingestion message? This is a realtime table if that makes a difference.  
**@npawar:** Yes this is supported  
**@pedro.cls93:** So the transformationFunction can receive as a parameter a
field that does not exist in the schema? Is there documentation on this use-
case?  
**@npawar:** Yes. If you have a function like bar="reverse(foo)", where bar is
in the Pinot schema, but foo is not in the Pinot schema, it will work if foo
is present in the source record. Pinot will look for all arguments to the
function in the row  
**@npawar:** Dunno if this is specifically documented, there might be a
mention on the page with the transform functions documentation  
 **@pedro.cls93:** What are the consequences of defining a dimension as a
String of length Integer.MaxSize and creating a json index based on it? What
hardware considerations need to be made if any? What is the performance
impact?  
 **@bigdatalabsg:** @bigdatalabsg has joined the channel  
 **@kevin.smith:** @kevin.smith has joined the channel  
 **@aiyer:** Hi Team -- A basic question , If we use S3 for storing the
segments, how does pinot take care of query latency since there will be netw
call involved?  
**@pedro.cls93:** Someone with more experience can chime in but Pinot
controller & server instances will have a local copy of the segments to handle
queries.  
**@pedro.cls93:** See this thread  
**@g.kishore:** Only Pinot server will have the local copy, not controller  
**@aiyer:** ok.. so does it mean the cluster must have enough memory combined
to host all the data that is getting ingested?  
**@aiyer:** and that would keep growing in production, so how do we generally
scale that ? Should we be looking at sacling up number of servers as traffic
and data grows?  
**@g.kishore:** No, all the data is stored on disk and mmaped. So you need
enough disk space  
**@aiyer:** ok.. what is we use S3 ?  
**@g.kishore:** s3 the backup store  
**@aiyer:** oh ok.. so we will have data in disk of the servers +the backup
store.  
**@mayanks:** Correct  
**@pedro.cls93:** @g.kishore So when configuring S3 or some other object store
as deep store for segments means that disk space in servers is for caching
purposes only?  
**@pedro.cls93:** Are there any metrics/ways to see if see if a given server
is under-provisioned for caching segments?  
**@mayanks:** Yes, servers use local disk for better performance, as of now  
**@mayanks:** There’s an endpoint in server to get table size.  
**@pedro.cls93:** Hello everyone, What does Pinot store in zookeeper metadata.
I currently have 2GB out of 2.5GB of disk used up (78.5%) in my zookeeper
instance. Should this be a cause for concern?  
**@mayanks:** Cluster state, segment metadata etc. do you have too many
segments?  
**@mayanks:** Whatever is visible via ZK in the UI is stored inside ZK  
**@pedro.cls93:** What is too many segments? I have 1 table only with 3842
segments.  
**@mayanks:** Yeah 3.8k segments might lead to 2GB, each segment has metadata
stored in ZK  
**@mayanks:** Too many is relative of course, here I meant enough to justify
2GB disk storage  
**@pedro.cls93:** Understood. Can zookeeper information be offloaded to hdfs
like the segments?  
**@mayanks:** Not sure if ZK supports that If it did, then you will incur
latency for ZK read/writes  
 **@karinwolok1:** :wave: Let's give a big warm welcome to all the new Pinot
community members! :wine_glass: :point_right: We'd love to learn a little bit
about who you are and what brought you here! Please take a moment and
introduce yourself to the community! :heart: @vishal.sharma09890 @bin.huangxb
@libhojani @darkyin87 @tony @gaurav.eca @guido.schmutz @k.santhiswarup
@richhickey @arthurnamiasdecrasto @onderaltinkurt @rohithuppala
@shaileshjha061 @sumit.m @david.gyoker @bigdatalabsg @kevin.smith  
 **@vishal.sharma09890:** Hi, thanks for having us here. I am a software
developer from India, just started in the tech industry last year, graduating
in 2020. My field of interest is distributed systems and i work majorly on go.  
 **@k.santhiswarup:** Hello Guys! I am checking out Pinot repo and I noticed
the code for thrideye is not in incubator-pinot anymore and all the references
to thirdeye are broken in the docs:disappointed: . Is this intentional?  
**@g.kishore:** Yes, it was intentional. its moved to  
**@g.kishore:** This was needed as part of Pinot graduation process in Apache  
**@k.santhiswarup:** Thanks @g.kishore for clarification.  
 **@kmvb.tau:** does pinot support window functions like presto?  
**@mayanks:** Not yet, could you file an issue describing the use case?  
**@kmvb.tau:** ok sure.. will do  
**@g.kishore:** No, we don't support window functions. You can achieve that
using presto -pinot connector  
**@kmvb.tau:** I understand using presto connector for join tables.. if pinot
supports all single table functions (window ) then it will be helpful. our use
case: To find avg response time for support tickets . We use lag window
functions over audit data . If we use presto connector then large amount of
data will be transferred to presto nodes and will increase query time as well.  
**@g.kishore:** if you partition the data on ticketid and use partition based
assignment, you can achieve this in Pinot via UDF. we did this for distinct
count and it showed amazing improvement in performance @jackie.jxt ^^  
**@kmvb.tau:** Any document reference for data partitioning techniques ?
Partitioning data on ticketid will definitely help query performance.  
**@jackie.jxt:** Here are the doc for the partition based segment assignment:  
**@mkrishna.parimi:** @mkrishna.parimi has joined the channel  
 **@keweishang:** @keweishang has joined the channel  

###  _#random_

  
 **@david.gyoker:** @david.gyoker has joined the channel  
 **@bigdatalabsg:** @bigdatalabsg has joined the channel  
 **@kevin.smith:** @kevin.smith has joined the channel  
 **@mkrishna.parimi:** @mkrishna.parimi has joined the channel  
 **@keweishang:** @keweishang has joined the channel  

###  _#feat-presto-connector_

  
 **@kmvb.tau:** @kmvb.tau has joined the channel  
 **@keweishang:** @keweishang has joined the channel  
 **@keweishang:** @keweishang has left the channel  

###  _#troubleshooting_

  
 **@david.gyoker:** @david.gyoker has joined the channel  
 **@bigdatalabsg:** @bigdatalabsg has joined the channel  
 **@kevin.smith:** @kevin.smith has joined the channel  
 **@jmeyer:** Sorry for bumping the question, maybe it got shadowed by
following ones. > Hello > What is the recommended approach to getting the
"last non-null value" ? > Use a UDF ?  
**@g.kishore:** Last non null value for a key?  
**@g.kishore:** Or across lot of keys  
**@jmeyer:** Across lot of keys Like in a "periodic" timeseries (ex: 1 data
point per day), where some values (ex: some days) are null. How can we
efficiently fetch the last *non-null* value ?  
**@jmeyer:** *Date - Value* 05/01/2021 - null 04/01/2021 - null 03/01/2021 -
3.0 02/01/2021 - 1.0 *Output: 3.0*  
**@g.kishore:** Ah, got it  
**@jmeyer:** Maybe, it'd be simpler not to store null values and get the
`last` value ? Not sure that'd be more efficient that way  
**@g.kishore:** Select date, Val from T where F value >0 order by date desc
limit 1 should work  
**@g.kishore:** Writing a udf might be more efficient but see if this works
for now  
**@jmeyer:** Makes sense indeed ! Do you expect this query to be highly
performant ? Should a specific index be added ?  
**@jmeyer:** Actually, what about negative values ? We'd be interested in
those too ^^ I know there's some support for null values, as in IS NOT NULL -
but iirc, there's a performance penalty associated with this feature  
**@g.kishore:** No, not null index is very fast  
**@g.kishore:** So you can use that  
**@jmeyer:** So, enable this feature (`nullHandlingEnabled`) And query like
`SELECT value FROM table WHERE value IS NOT NULL ORDER BY date LIMIT 1`  
**@jmeyer:** Sounds right ?  
**@g.kishore:** yes  
**@jmeyer:** Great, thank you @g.kishore :slightly_smiling_face: Have a good
day !  
 **@keweishang:** @keweishang has joined the channel  

###  _#getting-started_

  
 **@keweishang:** @keweishang has joined the channel  
 **@keweishang:** @keweishang has left the channel  

###  _#minion-improvements_

  
 **@laxman:** I tried union approach. Pinot unit tests are failing. So, didn’t
feel confident to push it to test environment  
 **@laxman:** TIMES function returning double. But `clicks` is of integer
type. Wondering how it was working. And I feel this should fail earlier too.  
 **@laxman:** Here is the patch I tried  
 **@laxman:** ```diff --git a/pinot-
core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
b/pinot-
core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
index 920d75f7d..097873325 100644 \--- a/pinot-
core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
+++ b/pinot-
core/src/main/java/org/apache/pinot/core/segment/processing/utils/SegmentProcessorUtils.java
@@ -66,22 +66,22 @@ public final class SegmentProcessorUtils { if
(fieldSpec.isSingleValueField()) { switch (dataType) { case INT: \-
fieldAssembler = fieldAssembler.name(name).type().intType().noDefault(); \+
fieldAssembler = fieldAssembler.name(name).type().optional().intType(); break;
case LONG: \- fieldAssembler =
fieldAssembler.name(name).type().longType().noDefault(); \+ fieldAssembler =
fieldAssembler.name(name).type().optional().longType(); break; case FLOAT: \-
fieldAssembler = fieldAssembler.name(name).type().floatType().noDefault(); \+
fieldAssembler = fieldAssembler.name(name).type().optional().floatType();
break; case DOUBLE: \- fieldAssembler =
fieldAssembler.name(name).type().doubleType().noDefault(); \+ fieldAssembler =
fieldAssembler.name(name).type().optional().doubleType(); break; case STRING:
\- fieldAssembler = fieldAssembler.name(name).type().stringType().noDefault();
\+ fieldAssembler = fieldAssembler.name(name).type().optional().stringType();
break; case BYTES: \- fieldAssembler =
fieldAssembler.name(name).type().bytesType().noDefault(); \+ fieldAssembler =
fieldAssembler.name(name).type().optional().bytesType(); break; default: throw
new RuntimeException("Unsupported data type: " + dataType); @@ -89,19 +89,19
@@ public final class SegmentProcessorUtils { } else { switch (dataType) {
case INT: \- fieldAssembler =
fieldAssembler.name(name).type().array().items().intType().noDefault(); \+
fieldAssembler =
fieldAssembler.name(name).type().optional().array().items().intType(); break;
case LONG: \- fieldAssembler =
fieldAssembler.name(name).type().array().items().longType().noDefault(); \+
fieldAssembler =
fieldAssembler.name(name).type().optional().array().items().longType(); break;
case FLOAT: \- fieldAssembler =
fieldAssembler.name(name).type().array().items().floatType().noDefault(); \+
fieldAssembler =
fieldAssembler.name(name).type().optional().array().items().floatType();
break; case DOUBLE: \- fieldAssembler =
fieldAssembler.name(name).type().array().items().doubleType().noDefault(); \+
fieldAssembler =
fieldAssembler.name(name).type().optional().array().items().doubleType();
break; case STRING: \- fieldAssembler =
fieldAssembler.name(name).type().array().items().stringType().noDefault(); \+
fieldAssembler =
fieldAssembler.name(name).type().optional().array().items().stringType();
break; default: throw new RuntimeException("Unsupported data type: " +
dataType);```  
 **@laxman:** cc: @fx19880617 @jackie.jxt @npawar Please take a look if you
get few minutes.  
 **@laxman:** Also, please let me know if there are any issues I can work to
fix in this conversion flow.  

###  _#product-launch_

  
 **@g.kishore:** @g.kishore has joined the channel  
 **@g.kishore:** @g.kishore has left the channel  
\--------------------------------------------------------------------- To
unsubscribe, e-mail: dev-unsubscribe@pinot.apache.org For additional commands,
e-mail: dev-help@pinot.apache.org