Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2022/03/19 02:00:25 UTC

Apache Pinot Daily Email Digest (2022-03-18)

### _#general_

  
 **@ysuo:** @ysuo has joined the channel  
 **@puneet.singh:** @puneet.singh has joined the channel  
 **@sdg.deep:** @sdg.deep has joined the channel  
 **@diana.arnos:** Hey there, I'm trying to run the config recommendation
engine and I couldn't figure out how to fill in the number of kafka partitions
that we already have, and there is no example of the parameter name. I tried
```"partitionRuleParams": { "KAFKA_NUM_MESSAGES_PER_SEC_PER_PARTITION": 0.7,
"KAFKA_NUM_PARTITIONS": 128 },``` but `KAFKA_NUM_PARTITIONS` is not
recognized. How can I tell it the current number of kafka partitions we have?  
 **@diana.arnos:** Also, even when I remove this property, I don't get the
`realTimeProvisionRecommendations`. And `2147483647` kafka partitions seem to
be a little bit too much :laughing: Here's the body I'm sending:
```
{
  "schema": {
    "dimensionFieldSpecs": [
      { "averageLength": 36, "cardinality": 900000000, "name": "responseId", "dataType": "STRING" },
      { "averageLength": 36, "cardinality": 300000, "name": "formId", "dataType": "STRING" },
      { "averageLength": 36, "cardinality": 50000, "name": "channelId", "dataType": "STRING" },
      { "averageLength": 25, "cardinality": 5, "name": "channelPlatform", "dataType": "STRING" },
      { "averageLength": 36, "cardinality": 7000, "name": "companyId", "dataType": "STRING" },
      { "cardinality": 2, "name": "submitted", "dataType": "BOOLEAN" },
      { "cardinality": 2, "name": "deleted", "dataType": "BOOLEAN" }
    ],
    "schemaName": "responseCount"
  },
  "queriesWithWeights": {
    "select DATETIMECONVERT(createdAt, '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd''T''HH:mm:ss.SSSZ', '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy', '1:DAYS') as timeWindow, channelPlatform, deleted, submitted, count(*) as total from responseCount where companyId = '<redacted>' group by timeWindow, channelPlatform, deleted, submitted order by timeWindow, channelPlatform limit 10000000": 1
  },
  "tableType": "REALTIME",
  "qps": 5,
  "latencySLA": 2000,
  "partitionRuleParams": {
    "KAFKA_NUM_MESSAGES_PER_SEC_PER_PARTITION": 0.7
  },
  "rulesToExecute": {
    "recommendRealtimeProvisioning": true
  }
}
```
The response I get:
```
{
  "realtimeProvisioningRecommendations": {},
  "segmentSizeRecommendations": {
    "message": "Segment sizing for realtime-only tables is done via Realtime Provisioning Rule",
    "numRowsPerSegment": 0,
    "numSegments": 0,
    "segmentSize": 0
  },
  "indexConfig": {
    "sortedColumnOverwritten": true,
    "invertedIndexColumns": [],
    "noDictionaryColumns": [ "responseId" ],
    "rangeIndexColumns": [],
    "sortedColumn": "companyId",
    "bloomFilterColumns": [ "companyId" ],
    "onHeapDictionaryColumns": [],
    "varLengthDictionaryColumns": [ "formId", "companyId", "channelPlatform", "channelId" ]
  },
  "partitionConfig": {
    "numKafkaPartitions": 2147483647,
    "numPartitionsRealtime": 1,
    "partitionDimension": "",
    "numPartitionsOffline": 1,
    "numPartitionsOfflineOverwritten": false,
    "numPartitionsRealtimeOverwritten": false,
    "partitionDimensionOverwritten": false
  },
  "flaggedQueries": {
    "flaggedQueries": {
      "select DATETIMECONVERT(createdAt, '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd''T''HH:mm:ss.SSSZ', '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy', '1:DAYS') as timeWindow, channelPlatform, deleted, submitted, count(*) as total from responseCount where companyId = '<redacted>' group by timeWindow, channelPlatform, deleted, submitted order by timeWindow, channelPlatform limit 10000000": "Warning: Please verify if you need to pull out huge number of records for this query. Consider using smaller limit than 100000"
    }
  },
  "aggregateMetrics": false
}
```  
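For reference, `2147483647` is `Integer.MAX_VALUE`, which reads like a sentinel/default rather than a real recommendation, i.e. it looks as if the engine never received the actual partition count. As a rough, hypothetical illustration (not the recommender's actual code) of how a per-partition throughput threshold like `KAFKA_NUM_MESSAGES_PER_SEC_PER_PARTITION` could translate into a partition count:
```
// Hypothetical back-of-envelope sketch, NOT Pinot's recommendation engine code.
// Assumes the rule param is a per-partition throughput threshold (msgs/sec/partition).
public class PartitionEstimate {
  static int estimatePartitions(double messagesPerSec, double messagesPerSecPerPartition) {
    return (int) Math.ceil(messagesPerSec / messagesPerSecPerPartition);
  }

  public static void main(String[] args) {
    // e.g. 90 msgs/sec at a 0.7 msgs/sec/partition threshold -> 129 partitions
    System.out.println(estimatePartitions(90, 0.7));
  }
}
```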
 **@adam.hutson:** @adam.hutson has joined the channel  
 **@npawar:** Pinot community member and committer @kharekartik has written
this nice blog about the Kinesis connector he worked on
:slightly_smiling_face:  

### _#random_

  
 **@ysuo:** @ysuo has joined the channel  
 **@puneet.singh:** @puneet.singh has joined the channel  
 **@sdg.deep:** @sdg.deep has joined the channel  
 **@adam.hutson:** @adam.hutson has joined the channel  

###  _#troubleshooting_

  
 **@ysuo:** @ysuo has joined the channel  
 **@puneet.singh:** @puneet.singh has joined the channel  
 **@sdg.deep:** @sdg.deep has joined the channel  
 **@luisfernandez:** hey friends, I have a question regarding `Table Consuming
Latency`. I have been turning off and on various parts of pinot to see how it
behaves, and this time I decided to turn off the kafka app that produces the
records to pinot for some time. I saw a latency increase when I turned off the
app: at least for p99, it was 160ms and now it's over a minute. When things
like this happen, do you expect pinot to get back to its regular level? Does it
ever get back? I was thinking that as the day goes by and this topic starts to
get less traffic, maybe things come down, but I was wondering if it can come
back any other way. Ofc this is still pretty fast, but I'm wondering what would
happen to the p99 times if I were to take down the app for a longer time.  
**@mayanks:** Assuming the > 1min latency you are referring to is for
consumption, I’d say the consumption catches up pretty fast, however, as you
can imagine it is a function of data size, number of partitions, number of
servers etc. I’d recommend testing it for practical scenarios you think you
will run into.  
**@luisfernandez:** this is 2 servers and 16 partitions, the kafka app is only 1
replica; it's back up to speed and we are processing 4k messages/sec  
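A rough way to reason about catch-up time after a producer outage, with hypothetical numbers (only the 4k msgs/sec figure comes from the thread; the peak ingest rate and outage length below are assumptions to illustrate the arithmetic):
```
// Back-of-envelope catch-up estimate after a producer outage (illustrative only).
public class CatchUpEstimate {
  public static void main(String[] args) {
    double produceRate = 4_000;       // msgs/sec produced (from the thread)
    double peakConsumeRate = 20_000;  // msgs/sec Pinot can ingest -- assumption, measure your own setup
    double outageSeconds = 600;       // producer down for 10 minutes -- hypothetical

    double backlogMsgs = produceRate * outageSeconds;                      // messages queued up in Kafka
    double catchUpSeconds = backlogMsgs / (peakConsumeRate - produceRate); // spare capacity drains the backlog
    System.out.printf("backlog=%.0f msgs, catch-up ~%.0f s%n", backlogMsgs, catchUpSeconds);
  }
}
```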
**@walterddr:** can you share how you measure the "Table Consuming Latency"
during the time the app is turned off?  
**@luisfernandez:** `avg by (table)
(pinot_server_freshnessLagMs_XXthPercentile{kubernetes_namespace="$namespace"})`  
**@luisfernandez:** when i turned off the kafka app particularly for p99 it
went up  
**@walterddr:** ok so that value is measured as ```System.currentTimeMillis()
- minConsumingFreshnessMs```. in the case where your app was turned off, it is
basically a linear function of wall time.  
**@walterddr:** (since your minConsumingFreshnessMs is the timestamp of your
last ingested kafka msg)  
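In other words, the freshness lag gauge is wall-clock time minus the timestamp of the latest consumed message, so while the producer is down it grows by one second per second. A minimal sketch of that relationship (illustrative only, not Pinot's metric-emission code):
```
// Minimal sketch of the freshness-lag relationship described above (not Pinot's code).
public class FreshnessLag {
  /** minConsumingFreshnessMs: timestamp of the latest message the server has consumed. */
  static long freshnessLagMs(long minConsumingFreshnessMs) {
    return System.currentTimeMillis() - minConsumingFreshnessMs;
  }

  public static void main(String[] args) throws InterruptedException {
    long lastIngestedMs = System.currentTimeMillis(); // pretend the last message arrived just now
    Thread.sleep(2_000);                              // no new messages for 2 seconds
    System.out.println(freshnessLagMs(lastIngestedMs) + " ms"); // ~2000 ms: lag tracks wall time
  }
}
```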
**@walterddr:** is it possible for you to test turning the app back on and see
how fast this metric restores to 0? like mayank suggested?  
**@luisfernandez:** right the app is on already but that metric is not going
down  
**@luisfernandez:**  
**@luisfernandez:** that’s p99  
**@walterddr:** can you make a query?  
**@luisfernandez:** like MAX time or something like that on the table?  
**@walterddr:** just count* is fine  
**@luisfernandez:** yea i can  
**@walterddr:** and check if the metrics comes down  
**@luisfernandez:** they don’t come down  
**@luisfernandez:** but i’m curious how does it relate  
**@walterddr:** upon checking the code path, that metric only updates when a
query is being processed.  
**@luisfernandez:** ohh, this cluster is actively getting some queries  
**@walterddr:** yeah but is that specific for that table?  
**@luisfernandez:** yes specific for this table  
**@luisfernandez:** the metric too  
**@walterddr:** oh. interesting!  
 **@adam.hutson:** @adam.hutson has joined the channel  
 **@luisfernandez:** hey friends it’s me again, I was using apache `ab` to do a
simple load test against the brokers in pinot, and we noticed that the
exceptions in the server skyrocketed while ab was running. it seems like this
is the stack trace:
```
Encountered exception while processing requestId 9610 from broker Broker_pinot-broker-1.pinot-broker-headless.pinot.svc.cluster.local_8099
java.lang.NullPointerException: null
  at org.apache.pinot.core.util.trace.TraceContext.getTraceInfo(TraceContext.java:191) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:223) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:151) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:137) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
  at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  at java.lang.Thread.run(Thread.java:829) [?:?]
```
does anyone know what this NullPointer may refer to?  
**@luisfernandez:** in some places there isn’t even a stack trace, just
```java.lang.NullPointerException: null```  
**@mayanks:** Was this a trace context bug recently fixed by @richard892?  
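The failing frame is `TraceContext.getTraceInfo`. As a generic illustration (not Pinot's actual `TraceContext` implementation) of how a per-request trace registry can hit a `NullPointerException` under concurrent load when a request's trace entry was never registered or was already cleaned up:
```
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentHashMap;

// Generic illustration, NOT Pinot's TraceContext: a per-request trace registry where
// reading an entry that was never registered (or was removed by a concurrent cleanup)
// dereferences null and throws a NullPointerException.
public class TraceRegistrySketch {
  private static final ConcurrentHashMap<Long, Deque<String>> TRACES = new ConcurrentHashMap<>();

  static void register(long requestId) { TRACES.put(requestId, new ArrayDeque<>()); }
  static void unregister(long requestId) { TRACES.remove(requestId); }

  // Unsafe read: NPE if the request was never registered or already cleaned up.
  static String getTraceInfoUnsafe(long requestId) { return TRACES.get(requestId).toString(); }

  // Defensive variant: tolerate a missing entry.
  static String getTraceInfoSafe(long requestId) {
    Deque<String> trace = TRACES.get(requestId);
    return trace == null ? "(no trace registered)" : trace.toString();
  }

  public static void main(String[] args) {
    System.out.println(getTraceInfoSafe(9610));   // "(no trace registered)"
    System.out.println(getTraceInfoUnsafe(9610)); // throws NullPointerException
  }
}
```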
 **@weixiang.sun:** In an upsert table, can we update the timestamp of a row?  

###  _#pinot-dev_

  
 **@ashish:** Hi, in the GroupByKeyGenerator, under some circumstances raw
keys are used (by invoking the getInternal method on the dictionary) instead of
dictIds. This does not go through the DataFetcher and hence causes increased
latency. Is there a way around this?  
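A generic sketch of the difference being described (not Pinot's GroupKeyGenerator): grouping on small integer dictionary IDs is much cheaper than resolving and hashing the raw value for every row, which is roughly what a per-key fallback to `getInternal` implies:
```
import java.util.HashMap;
import java.util.Map;

// Generic illustration, not Pinot's code: dictId-based vs raw-value group keys.
public class GroupKeySketch {
  public static void main(String[] args) {
    String[] dictionary = {"US", "CA", "MX"}; // dictId -> raw value
    int[] dictIdColumn = {0, 1, 0, 2, 0, 1};  // per-row dictIds, read in bulk

    // Cheap path: group counts keyed by dictId (small ints, no per-row value lookup).
    Map<Integer, Long> byDictId = new HashMap<>();
    for (int dictId : dictIdColumn) {
      byDictId.merge(dictId, 1L, Long::sum);
    }

    // Expensive path: group counts keyed by the raw value, resolved row by row
    // (analogous to calling the dictionary's getInternal for each key).
    Map<String, Long> byRawValue = new HashMap<>();
    for (int dictId : dictIdColumn) {
      byRawValue.merge(dictionary[dictId], 1L, Long::sum);
    }

    System.out.println(byDictId);   // {0=3, 1=2, 2=1}
    System.out.println(byRawValue); // e.g. {US=3, CA=2, MX=1} (iteration order not guaranteed)
  }
}
```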

###  _#getting-started_

  
 **@ysuo:** @ysuo has joined the channel  
 **@puneet.singh:** @puneet.singh has joined the channel  
 **@sdg.deep:** @sdg.deep has joined the channel  
 **@adam.hutson:** @adam.hutson has joined the channel  