Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2021/11/16 02:00:21 UTC

Apache Pinot Daily Email Digest (2021-11-15)

### _#general_

  
 **@kautsshukla:** Hi All, I'm getting the error `"_error": "Permission is denied for access type 'READ' to the endpoint"`. It's happening while I'm trying to connect through the pinot-jdbc 0.8.0 client using a username and password.  
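For context, a minimal sketch of how such a connection is usually opened with the Pinot JDBC driver (class `org.apache.pinot.client.PinotDriver`), with credentials passed through the standard `DriverManager` call; the host, port, user, password, and table below are placeholders, and whether the credentials are honored depends on the client version and on how access control is configured on the cluster:
```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PinotJdbcExample {
  public static void main(String[] args) throws Exception {
    // Explicit registration in case the driver is not auto-loaded via ServiceLoader.
    Class.forName("org.apache.pinot.client.PinotDriver");
    // Placeholder controller endpoint; the authenticated user needs READ access to it.
    String url = "jdbc:pinot://localhost:9000";
    try (Connection conn = DriverManager.getConnection(url, "myUser", "myPassword");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM myTable")) {
      while (rs.next()) {
        System.out.println(rs.getLong(1));
      }
    }
  }
}
```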
 **@julien.picard:** @julien.picard has joined the channel  
 **@ken:** Not sure what channel is best for this, but I was looking into
Pinot-related issues (from other projects) on the Apache Jira site, and
noticed that there’s a Pinot project with a handful of old issues. See  
**@ken:** Wondering if it’s possible to add a single open issue that says
“file issues using GitHub at xxx”, and migrate/remove all the other issues,
and prevent new issues from being created?  
**@ken:** Right now if someone stumbles on this, it looks like Pinot is a dead
project :disappointed:  
**@mayanks:** Thanks @ken, agree it would be a good idea to point folks to GH
issues.  
**@ken:** I don’t have sufficient Jira-fu to do the above, but it should be
pretty easy (other than determining, for each real issue that’s still open,
whether it’s been replicated to GitHub)  
 **@shantanoo.sinha:** @shantanoo.sinha has joined the channel  

###  _#random_

  
 **@julien.picard:** @julien.picard has joined the channel  
 **@shantanoo.sinha:** @shantanoo.sinha has joined the channel  

###  _#feat-upsert_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#pinot-helix_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#group-by-refactor_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#inconsistent-segment_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#minion-star-tree_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#troubleshooting_

  
 **@yash.agarwal:** We have a Pinot cluster, and some of our users are running
very heavy queries which result in ```java.lang.OutOfMemoryError: Java heap
space``` This is fine, but as a result the server instance becomes
unhealthy, i.e. the Live Instance Config becomes ```{ "_code": 404,
"_error": "ZKPath /PinotCluster/LIVEINSTANCES/Server_node_8098 does not
exist:" }``` How can we solve this?  
**@g.kishore:** you can set the maxQueryLimit and maxGroupBy limits  
**@yash.agarwal:** Sure, but even then there are cases where the limits are
quite small but the query is doing a count distinct on a large column.  
**@yash.agarwal:** Ideally we are preventing all these queries in our
middle layer and converting them to optimized versions, but just in case,
we want to avoid our nodes going down.  
**@g.kishore:** how large is it?  
**@g.kishore:** did you try partitionedDistinct?  
**@yash.agarwal:** Yes, we have that implemented, but these are very fringe
cases, hence trying to understand how to avoid the node going down.  
**@yash.agarwal:** Our memory settings are Xms4G Xmx8G on 16 G nodes. Should
we bump down our Xmx even further ?  
**@g.kishore:** Two options: • increase the memory to ensure that distinct
values fit in memory • add configuration to limit the max distinct values, or enhance the
distinct operator to start using HLL when the number of unique values goes beyond a
certain size.. this will require a code change  
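As a stop-gap until such an enhancement exists, the fringe queries can often be rewritten to the approximate HLL-backed aggregation Pinot already ships, which keeps a bounded-size sketch instead of materializing every unique value (table and column names below are placeholders):
```
-- Exact distinct count: memory grows with the number of unique values
SELECT DISTINCTCOUNT(user_id) FROM myTable

-- Approximate distinct count backed by HyperLogLog: bounded memory, small relative error
SELECT DISTINCTCOUNTHLL(user_id) FROM myTable
```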
**@yash.agarwal:** I am more worried about making sure the node is able to fix
itself after such a query.  
**@yash.agarwal:** Currently the only option for us is to restart the server
instance  
**@g.kishore:** Can you please file an issue  
**@yash.agarwal:** I think there is a similar issue already created. Hence
not creating a duplicate.  
 **@alihaydar.atil:** Hello everyone, I am using version 0.8.0. When I run the
RealtimeProvisioningHelper command below, it gives me an exception. Any idea
why this happens? I have put one realtime table segment in the
sampleCompletedSegmentDir directory. Command: ```root@pinot-
controller-0:/opt/pinot# bin/pinot-admin.sh RealtimeProvisioningHelper
-tableConfigFile /opt/pinot/denizTableConfig.json -numPartitions 1 -numHosts 2
-numHours 6,12,18,24 -sampleCompletedSegmentDir
/opt/pinot/samplesegment/realtime/ -ingestionRate 100``` Exception:
```Executing command: RealtimeProvisioningHelper -tableConfigFile
/opt/pinot/denizTableConfig.json -numPartitions 1 -pushFrequency null
-numHosts 2 -numHours 6,12,18,24 -sampleCompletedSegmentDir
/opt/pinot/samplesegment/realtime/ -ingestionRate 100 -maxUsableHostMemory 48G
-retentionHours 0 Exception caught: java.lang.RuntimeException: Caught
exception when reading segment index dir at
org.apache.pinot.controller.recommender.realtime.provisioning.MemoryEstimator.<init>(MemoryEstimator.java:117)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.tools.admin.command.RealtimeProvisioningHelperCommand.execute(RealtimeProvisioningHelperCommand.java:268)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:169)
[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:189)
[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a]
Caused by: java.lang.NullPointerException: Cannot find segment metadata file
under directory: /opt/pinot/samplesegment/realtime at
shaded.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:864)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.segment.spi.index.metadata.SegmentMetadataImpl.getPropertiesConfiguration(SegmentMetadataImpl.java:144)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.segment.spi.index.metadata.SegmentMetadataImpl.<init>(SegmentMetadataImpl.java:117)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at
org.apache.pinot.controller.recommender.realtime.provisioning.MemoryEstimator.<init>(MemoryEstimator.java:115)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] ...
3 more``` realtime table config file [-tableConfigFile
/opt/pinot/denizTableConfig.json] ```{ "tableName": "denizhybrid",
"tableType": "REALTIME", "segmentsConfig": { "timeColumnName": "messageTime",
"timeType": "MILLISECONDS", "schemaName": "deniz", "replicasPerPartition":
"1", "retentionTimeUnit":"DAYS", "retentionTimeValue":"2" }, "tenants": {},
"fieldConfigList": [ { "name": "location_st_point", "encodingType":"RAW",
"indexType":"H3", "properties": { "resolutions": "5" } } ],
"tableIndexConfig": { "loadMode": "MMAP", "rangeIndexColumns": [ "latitude",
"longitude" ], "noDictionaryColumns": [ "location_st_point" ],
"streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type":
"lowlevel", "stream.kafka.topic.name": "kafkadeniztest2",
"stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.broker.list": "kafka:9092",
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest" } }, "query": {
"timeoutMs": 60000 }, "metadata": { "customConfigs": {} }, "task": {
"taskTypeConfigsMap": { "RealtimeToOfflineSegmentsTask": {
"bucketTimePeriod":"6h", "bufferTimePeriod":"9h",
"maxNumRecordsPerSegment":"1000000" } } } }``` Thanks in Advance.  
**@mayanks:** Can you list files inside of segment dir that you provided?  
 **@alihaydar.atil:** There is only one segment file inside
-sampleCompletedSegmentDir, which is named denizhybrid__0__23__20211114T2333Z;
it is around 30MB. I also have tried putting only one offline segment file
named denizhybrid_1636480800506_1636502392443_0 in that folder but got the
same exception.  
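The NPE above complains that no segment metadata file was found directly under the supplied directory, which suggests `-sampleCompletedSegmentDir` is expected to point at an untarred segment directory containing `metadata.properties`. A hedged sketch of preparing it, assuming the 30MB file is a tarred segment (paths reuse the ones from the command above):
```
# Assumption: the 30MB file is a segment tarball. Untar it so metadata.properties
# sits inside the directory the helper is pointed at.
mkdir -p /opt/pinot/samplesegment/realtime/untarred
tar -xzf /opt/pinot/samplesegment/realtime/denizhybrid__0__23__20211114T2333Z \
    -C /opt/pinot/samplesegment/realtime/untarred
# Then rerun RealtimeProvisioningHelper with:
#   -sampleCompletedSegmentDir /opt/pinot/samplesegment/realtime/untarred/denizhybrid__0__23__20211114T2333Z
```
If the file is already an untarred directory, pointing `-sampleCompletedSegmentDir` one level deeper, at the segment directory itself, may be enough.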
 **@kchavda:** Hi All, had a few questions about using `Pinot managed offline
flows`. Any help would be greatly appreciated! 1\. Does the OFFLINE table
config need to have the `RealtimeToOfflineSegmentsTask` match the one added to
the REALTIME table config? 2\. I'm seeing this `TASK_ERROR to DROPPED` in the
minion log. What does this signify? ```20 START:INVOKE
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES
listener:org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 type:
CALLBACK Resubscribe change listener to path:
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, for listener:
org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932, watchChild:
false Subscribing changes listener to path:
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, type: CALLBACK,
listener: org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932
Subscribing child change listener to
path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES Subscribing to
path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES took:0 The
latency of message 6a8ac921-3913-43e8-a777-b15c16185245 is 7 ms Scheduling
message 6a8ac921-3913-43e8-a777-b15c16185245:
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0,
TASK_ERROR->DROPPED Submit task: 6a8ac921-3913-43e8-a777-b15c16185245 to pool:
java.util.concurrent.ThreadPoolExecutor@67024f54[Running, pool size = 40,
active threads = 0, queued tasks = 0, completed tasks = 221] Message:
6a8ac921-3913-43e8-a777-b15c16185245 handling task scheduled 20 END:INVOKE
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES
listener:org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 type:
CALLBACK Took: 8ms handling task: 6a8ac921-3913-43e8-a777-b15c16185245 begin,
at: 1636993355435 handling message: 6a8ac921-3913-43e8-a777-b15c16185245
transit
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945.TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0|[]
from:TASK_ERROR to:DROPPED, relayedFrom: null Merging with delta list,
recordId =
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945
other:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945
Instance Minion_172.19.0.6_9514, partition
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0
received state transition from TASK_ERROR to DROPPED on session
1005c465f540008, message id: 6a8ac921-3913-43e8-a777-b15c16185245 Merging with
delta list, recordId =
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945
other:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945
Removed
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/CURRENTSTATES/1005c465f540008/TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945
Message 6a8ac921-3913-43e8-a777-b15c16185245 completed. Delete message
6a8ac921-3913-43e8-a777-b15c16185245 from zk! message finished:
6a8ac921-3913-43e8-a777-b15c16185245, took 14 Message:
6a8ac921-3913-43e8-a777-b15c16185245 (parent: null) handling task for
TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0
completed at: 1636993355449, results: true. FrameworkTime: 1 ms; HandlerTime:
13 ms. Subscribing changes listener to path:
/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, type: CALLBACK,
listener: org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932
Subscribing child change listener to
path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES Subscribing to
path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES took:0``` 3\. The
tasks/scheduler/information API endpoint returns "Task scheduler is disabled".
I've added an entry to the controller config `"controller.task.frequencyInSeconds":
3600`. Is there some other setting I need to configure? 4\. The
tasks/task/taskname/state endpoint is giving a `500 Index 1 out of bounds for length
1"` but tasks/tasktype/taskstates shows completed. I'm not seeing any segments
added to my OFFLINE table though. Any idea on what's missing?  
**@npawar:** 1\. No need to set anything in the offline table 2\. Looks like the
task had some exceptions; there should be some more logs about why the task
failed and went into TASK_ERROR state (and then from TASK_ERROR to DROPPED).
Any exception/error logs from before what you've pasted? 3\. Not sure why it
says disabled. As long as you're seeing the controller create tasks in the logs,
and the minion pick up the tasks, you're good. If you're not seeing that, try
adding `controller.task.scheduler.enabled: true` to the controller config
4\. Let's see more logs from controller/minion?  
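Putting the two properties from this exchange together, the controller config would contain roughly:
```
controller.task.scheduler.enabled=true
controller.task.frequencyInSeconds=3600
```
(3600s is the frequency already set in the question; `controller.task.scheduler.enabled` is the flag suggested above.)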
**@kchavda:** 1 :white_check_mark: 2 - I didn't go far back enough in the log. There
is an exception: ```Caught exception while fetching segment from:
to:
/tmp/PinotMinion/data/RealtimeToOfflineSegmentsTask/tmp-106dfc56-8986-48a1-98cb-c97c7b2bc767/tarredSegmentFile_0
java.lang.IllegalStateException: PinotFS for scheme: s3 has not been
initialized``` I am using S3 for deep storage. I do see segments being written
there. I'm guessing I need to pass in `env` values for access key and secret
key? 3/4 - Maybe fixing the above will fix these? @npawar  
**@npawar:** ah.. you need to add deep store properties to your minion
components. you must’ve added some deep store configs to controller/server?  
**@kchavda:** Yes, I added the configs to controller/server and the segments
are being written to S3. Are minion specific configs documented?  
**@npawar:** no.. let me add it  
**@npawar:** would love some detailed feedback about the docs from you after
this :stuck_out_tongue:  
**@kchavda:** Great! Thank you Neha!  
**@kchavda:** For sure :slightly_smiling_face:  
**@kchavda:** I'm about to watch your presentation on this topic from back in
July!  
**@npawar:**  
**@npawar:** added for all the FS. it’s the same as server/controller, except
the prefix is shorter  
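A sketch of what the minion-side S3 deep store properties might look like, assuming they mirror the server/controller keys under a `pinot.minion` prefix; the region and credentials are placeholders, and the exact key names should be checked against the doc referenced above:
```
pinot.minion.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.minion.storage.factory.s3.region=us-east-1
pinot.minion.segment.fetcher.protocols=file,http,s3
pinot.minion.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
# Only needed if not relying on instance profiles / environment credentials:
# pinot.minion.storage.factory.s3.accessKey=<access-key>
# pinot.minion.storage.factory.s3.secretKey=<secret-key>
```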
**@kchavda:** Great! Thank you!  
**@kchavda:** Comparing against the controller config referenced in the tutorial, in my
working version I had to add the following: ```controller.helix.cluster.name=PinotCluster
controller.zk.str=pinot-zookeeper:2181
controller.host=
controller.port=9000```  
**@kchavda:** For server.conf, the following (present in the tutorial):
```pinot.server.netty.port=8098
pinot.server.adminapi.port=8097
pinot.server.instance.dataDir=/tmp/pinot-tmp/server/index
pinot.server.instance.segmentTarDir=/tmp/pinot-tmp/server/segmentTars```  
**@kchavda:** So I restarted the minion and that resolved the S3 error  
**@kchavda:** However, I'm now seeing errors on the controller/server and the OFFLINE
table is in a bad status.  
**@kchavda:** Server log excerpt ```2021/11/15 20:59:35.301 ERROR
[SegmentFetcherAndLoader] [HelixTaskExecutor-message_handle_thread] Attempts
exceeded when downloading segment:
consolidations_1550859587018_1550861765735_0 for table: consolidations_OFFLINE
from:  to: /tmp/PinotServer/segmentTar/consolidations_OFFLINE/tmp-
consolidations_1550859587018_1550861765735_0-fb7cfa39-13a5-4f3e-813b-f4e36a505290/consolidations_1550859587018_1550861765735_0.tar.gz
2021/11/15 20:59:35.302 ERROR [SegmentFetcherAndLoader] [HelixTaskExecutor-
message_handle_thread] Cannot load segment :
consolidations_1550859587018_1550861765735_0 for table consolidations_OFFLINE
org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed
after 3 attempts at
org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:?] at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[?:?] at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 2021/11/15 20:59:35.302
ERROR [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel]
[HelixTaskExecutor-message_handle_thread] Caught exception in state transition
from OFFLINE -> ONLINE for resource: consolidations_OFFLINE, partition:
consolidations_1550859587018_1550861765735_0
org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed
after 3 attempts at
org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:?] at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[?:?] at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 2021/11/15 20:59:35.303
ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread]
Exception while executing a state transition task
consolidations_1550859587018_1550861765735_0
java.lang.reflect.InvocationTargetException: null at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:?] at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)
[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-
all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[?:?] at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[?:?] at java.lang.Thread.run(Thread.java:829) [?:?] Caused by:
org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed
after 3 attempts at
org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at
org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166)
~[pinot-all-0.8.0-jar-with-
dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] ... 12 more
2021/11/15 20:59:35.312 ERROR [StateModel] [HelixTaskExecutor-
message_handle_thread] Default rollback method invoked on error. Error Code:
ERROR``` I see a new segment on S3 though.  
**@kchavda:** controller error: ```2021/11/15 20:59:27.429 ERROR
[MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(1ee314dc_DEFAULT)] Event 1ee314dc_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:27.448 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(9f672776_DEFAULT)] Event 9f672776_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.340 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(2bdf94fa_DEFAULT)] Event 2bdf94fa_DEFAULT : Unable to find a
next state for resource: consolidations_OFFLINE partition:
consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.340 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(2bdf94fa_DEFAULT)] Event 2bdf94fa_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.362 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(a26a1dc5_DEFAULT)] Event a26a1dc5_DEFAULT : Unable to find a
next state for resource: consolidations_OFFLINE partition:
consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.363 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(a26a1dc5_DEFAULT)] Event a26a1dc5_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.378 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(76e13678_DEFAULT)] Event 76e13678_DEFAULT : Unable to find a
next state for resource: consolidations_OFFLINE partition:
consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:35.378 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(76e13678_DEFAULT)] Event 76e13678_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
20:59:55.384 ERROR [CompletionServiceHelper] [grizzly-http-server-15] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 20:59:55.432 ERROR
[CompletionServiceHelper] [grizzly-http-server-4] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:04.680 ERROR
[CompletionServiceHelper] [grizzly-http-server-7] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:04.730 ERROR
[CompletionServiceHelper] [grizzly-http-server-0] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:07.529 ERROR
[CompletionServiceHelper] [grizzly-http-server-6] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:07.578 ERROR
[CompletionServiceHelper] [grizzly-http-server-7] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:43.084 ERROR
[CompletionServiceHelper] [grizzly-http-server-3] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:43.179 ERROR
[CompletionServiceHelper] [grizzly-http-server-6] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:46.233 ERROR
[CompletionServiceHelper] [grizzly-http-server-11] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:46.284 ERROR
[CompletionServiceHelper] [grizzly-http-server-12] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:52.885 ERROR
[CompletionServiceHelper] [grizzly-http-server-6] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:14.731 ERROR
[CompletionServiceHelper] [grizzly-http-server-11] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:17.036 ERROR
[CompletionServiceHelper] [grizzly-http-server-11] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:17.081 ERROR
[CompletionServiceHelper] [grizzly-http-server-15] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:20.184 ERROR
[CompletionServiceHelper] [grizzly-http-server-4] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:20.276 ERROR
[CompletionServiceHelper] [grizzly-http-server-6] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:03:44.386 ERROR
[CompletionServiceHelper] [grizzly-http-server-15] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:03:44.438 ERROR
[CompletionServiceHelper] [grizzly-http-server-10] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:06:56.253 ERROR
[CompletionServiceHelper] [grizzly-http-server-12] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:06:56.287 ERROR
[CompletionServiceHelper] [grizzly-http-server-1] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:10:43.545 ERROR
[CompletionServiceHelper] [grizzly-http-server-9] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:10:43.575 ERROR
[CompletionServiceHelper] [grizzly-http-server-13] Server:
Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:12:16.096 ERROR
[MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(d838849b_DEFAULT)] Event d838849b_DEFAULT : Unable to find a
next state for resource: consolidations_OFFLINE partition:
consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15
21:12:16.097 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-
PinotCluster-(d838849b_DEFAULT)] Event d838849b_DEFAULT : Unable to find a
next state for resource: profiles_OFFLINE partition:
profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass
org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE```  
**@npawar:** were any changes made to the controller/server side deep store
configs?  
**@kchavda:** Nope. I just added config to minion and restarted the docker
container.  
 **@tony:** Based on a thread from a few days ago, I changed our Pinot
deployment from 6 controllers to 3. Now I am seeing three controllers as
"dead" in Cluster Manager, and I am getting `segments ... unavailable` errors
(though I am not sure these two issues are related). 1\. How do I get rid of
"dead" controllers when I reduce the number of controllers? 2\. Could this
cause `segment ... unavailable`?  
**@mayanks:** For 2, could you share the output of debug api (from swagger)?  
**@tony:** For 2, the segments eventually changed to a good state - I think I
had not given the servers enough time to recover after restarting (for
something else). So (1) is not related to (2) - but I would still like to fix
(1)  
**@mayanks:** Can you paste the screenshot for 1)?  
**@mayanks:** Also, is it just the UI issue, or does the ZK browser also show the
issue? If the former, can you help file a GH issue?  
**@tony:**  
**@tony:** Not sure where to look in ZK browser  
**@mayanks:** You can check if there's any reference of the removed
controllers in the `CONTROLLER`, `INSTANCES` or `LIVEINSTANCES` nodes. If not,
then this may be a UI issue.  
**@tony:** There are references in INSTANCES but not LIVEINSTANCES  
**@mayanks:** Thanks @tony, could you file a GH issue (with as much info as
you can in terms of symptoms and repro steps)?  
**@tony:** Will do  
 **@elon.azoulay:** Hi, we observed that increasing the ZK client timeout in the
Pinot ZooKeeper settings does not prevent a ZK client timeout from Helix, which is
hardcoded. We see these errors when the brokers are under heavy GC pressure,
GC pauses, etc.  
**@elon.azoulay:** ```package org.apache.helix.manager.zk;

public class ZkClient extends org.apache.helix.manager.zk.zookeeper.ZkClient implements HelixZkClient {
  ...
  public static final int DEFAULT_SESSION_TIMEOUT = 30 * 1000;```  
**@elon.azoulay:** @jxue @g.kishore  
**@elon.azoulay:** Would it make sense to make this configurable?  
**@elon.azoulay:** I do agree w @g.kishore’s concern that the longer the
timeout the longer an outage is not detected  
**@mayanks:** Not to side track the conversation @elon.azoulay, but are you
seeing 30s GCs?  
**@elon.azoulay:** Thankfully we didn't, but we did see a spike in GC
activity/duration  
**@elon.azoulay:** Still investigating, could be a readiness probe failing  
 **@julien.picard:** @julien.picard has joined the channel  
 **@docchial:** @docchial has joined the channel  
 **@sandeep.hadoopadmn:** Hi team, can we join two tables and query them?  
**@g.kishore:** depends on the join type, only lookup join is supported as of
now  
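For reference, a hedged sketch of what a lookup join looks like in query form, using Pinot's `LOOKUP` function against a table that has been set up as a dimension/lookup table (all table and column names here are placeholders):
```
SELECT o.orderId,
       o.customerId,
       LOOKUP('customers', 'customerName', 'customerId', o.customerId) AS customerName
FROM orders o
LIMIT 10
```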
**@kchavda:**  
**@sandeep.hadoopadmn:** thank you  
 **@shantanoo.sinha:** @shantanoo.sinha has joined the channel  

###  _#pinot-k8s-operator_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#multi-region-setup_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#metadata-push-api_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#pinot-perf-tuning_

  
 **@rohit:** @rohit has joined the channel  

###  _#thirdeye-pinot_

  
 **@shreya.chakraborty:** @shreya.chakraborty has joined the channel  
 **@shreya.chakraborty:** Hi everyone :wave: What's the process for pushing a
new release to the repo? I don't see any release history/issues.  
 **@rohit:** @rohit has joined the channel  

###  _#getting-started_

  
 **@bagi.priyank:** hello. i started two pinot clusters with both of them
consuming from the same kafka cluster and same topic. one pinot cluster is
using inverted index on the same set of fields that the other one uses for
star-tree index. so basically two pinot tables where the only difference is
that first one uses inverted index while second one uses star-tree index. i
created tables at the same time so i am assuming that both start consuming
from the kafka topic at the same time. when i issue same query to both tables
one after another, i see that `totalDocs` is 2x/3x for table with inverted
index in comparison to table with star-tree index. if it matters, i started
querying tables after ~5-10 mins of creating them. i also confirmed this by
running ```select count(*) from <table_name>``` is this expected?  
 **@bagi.priyank:** i noticed that `group.id =` (basically empty), so maybe
both pinot tables are using the same group id.  
 **@bagi.priyank:** i tried using ``` "streamConfigs": { "streamType":
"kafka", "stream.kafka.consumer.type": "lowLevel", "stream.kafka.topic.name":
<topic_name>, "stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.broker.list": <broker_list>,
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M",
"stream.kafka.consumer.prop.auto.offset.reset": "largest",
"stream.kafka.consumer.prop.group.id": <group_id>,
"stream.kafka.decoder.prop.schema": <schema> }``` and ``` "streamConfigs": {
"streamType": "kafka", "stream.kafka.consumer.type": "highLevel",
"stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.hlc.bootstrap.server": <broker_list>,
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M",
"stream.kafka.consumer.prop.auto.offset.reset": "largest",
"stream.kafka.consumer.prop.group.id": <group_id>,
"stream.kafka.decoder.prop.schema": <schema> }``` and ``` "streamConfigs": {
"streamType": "kafka", "stream.kafka.consumer.type": "highLevel",
"stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.hlc.bootstrap.server": <broker_list>,
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M",
"stream.kafka.consumer.prop.auto.offset.reset": "largest",
"stream.kafka.consumer.prop.hlc.group.id": <group_id>,
"stream.kafka.decoder.prop.schema": <schema> }``` and none of those worked.
finally after looking at code i tried ``` "streamConfigs": { "streamType":
"kafka", "stream.kafka.consumer.type": "lowLevel", "stream.kafka.topic.name":
<topic_name>, "stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.broker.list": <broker_list>,
"stream.kafka.consumer.prop.auto.offset.reset": "largest",
"stream.kafka.group.id": <group_id>, "stream.kafka.decoder.prop.schema":
<schema>, "realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M" },``` and that was able to
consume from kafka but i don't see it in the list of kafka consumer groups.
logs still say group.id is empty. any help / pointers are appreciated.  
 **@bagi.priyank:** also tried ``` "streamConfigs": { "streamType": "kafka",
"stream.kafka.consumer.type": "highLevel", "stream.kafka.topic.name":
<topic_name>, "stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.hlc.bootstrap.server": <broker_list>,
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.hlc.group.id": <group_id>, "stream.kafka.decoder.prop.schema":
<schema>, "realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M" },``` but it doesn't consume any
events from kafka at all.  
 **@npawar:** You don't need the group id or any of the properties that say
"hlc". Your tables might be out of sync because you've set the offset criteria to
"largest". Each table will start consuming from the last message in the topic,
so if your rate of events is high, the second table will miss out on events that
were emitted between the creation of the first and second tables  
**@bagi.priyank:** I tried with smallest instead of largest first and that's
where I was seeing the difference and then I started using largest after that.
I did see in the code that Pinot uses <table_name>_<timestamp> as a default
group id. I am still confused why I don't see it in the list of consumer
groups. I'll try again today.  
**@npawar:** The concept of consumer group is not used in low level consumer  
**@bagi.priyank:** And I am creating tables in both clusters at the same time
using the same topic. if anything I would expect the difference to be smaller
and not 2-3x of one another as event rate is low.  
**@bagi.priyank:** i see. also forgot to mention that i am using 0.7.1 and
kafka 2.x  
**@bagi.priyank:** how do i use consumer group with high level consumer?
clearly i am missing something when configuring that as well.  
**@bagi.priyank:** do i need to use `stream.kafka.hlc.zk.connect.string` and
`stream.kafka.zk.broker.url` ? i see those in the example table configs in the
github repo for high level consumer. kafka cluster has its own zookeeper, and
each pinot cluster have their own zookeeper as well.  
**@npawar:** you shouldn’t be using high level, and hence shouldn’t have to
worry about consumer group  
**@bagi.priyank:** I see. Could you please go into a little bit of detail about why you
recommend that?  
**@npawar:** we’ve stopped actively developing high-level consumer and would
likely deprecate it soon. All you need for properties is  
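For reference, a minimal low-level `streamConfigs` sketch assembled from the configs already pasted earlier in this thread, with the hlc/group-id properties dropped (broker list, topic, and schema remain the poster's placeholders):
```
"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.consumer.type": "lowLevel",
  "stream.kafka.topic.name": <topic_name>,
  "stream.kafka.broker.list": <broker_list>,
  "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
  "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder",
  "stream.kafka.decoder.prop.schema": <schema>,
  "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
  "realtime.segment.flush.threshold.size": "0",
  "realtime.segment.flush.threshold.time": "24h",
  "realtime.segment.flush.desired.size": "50M"
}
```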
**@bagi.priyank:** Got it. Thank you so much once again for your help and
time.  
**@npawar:** still doesn’t solve your missing events issue though..is there a
way for you to run some queries (like min/max timestamps, or count(*) group by
timestamp) to verify that you’re indeed seeing events being missed?  
**@bagi.priyank:** Yeah let me try those queries and share results with you.  
**@bagi.priyank:** i am setting up everything to be able to run those queries.
in the meantime i have a few more questions. does the low level consumer use a group
id by itself? or am i wrong in understanding that it uses a default group id
based on table name and timestamp? if it is doing that, would merely using a
different table name help? if it is using a group id internally i don't
understand why kafka-consumer-groups doesn't show it? i do see empty space as one
of the consumer groups. if it is not using a group id, then wouldn't the two
tables compete with each other to consume from the same topic in the same
kafka cluster?  
**@bagi.priyank:** output for `select min(upload_time), max(upload_time) from
table` for table with inverted index  
**@bagi.priyank:** output for `select min(upload_time), max(upload_time) from
table` for table with star-tree index  
**@bagi.priyank:** looks like the one with `star-tree` index is lagging
behind.  
**@bagi.priyank:** used `largest` instead of `smallest` and they tend to be
more or less doing similarly well. i think it also helped that i used
different table names for the table with inverted index vs the table with star-tree
index. i don't have any proof other than what i am seeing :joy: . thank you
neha for all the help, your time and patience. much appreciated!  
**@npawar:** oh cool..  
**@npawar:** regarding `does low level consumer use group id by itself? or am
i wrong in understanding that it uses a default group id based on table name
and timestamp? if it is doing that, would merely using a different table name
help? if it is using a group id internally i don't understand why kafka-
consumer-groups doesn't show it? i do empty space as one of the consumer
group.` - we don't use a group id even internally.  
**@npawar:** `if it is not using a group id, then wouldn't the two tables
compete with each other to consume from the same topic in the same kafka
cluster` - Not sure what you mean by 2 tables should compete with each other.
If you’re saying that 2 tables will interfere with each other, such that the
messages they each receive are exclusive to the other - then no, that is not
what happens. We directly consume from offsets inside the pinot-server,
maintaining our own checkpointing  
**@npawar:** this might help:  This talks about how and why we moved away from
high-level to low level, and how it works internally  
**@bagi.priyank:** thank you. i do have questions around consuming from kafka
and offset management. i'll go through this case study first.  
 **@npawar:** @bagi.priyank ^  
 **@rohit:** @rohit has joined the channel  
 **@shreya.chakraborty:** @shreya.chakraborty has joined the channel  
 **@docchial:** @docchial has joined the channel  
 **@bagi.priyank:** The link for `Transform Function in Aggregation Grouping`
is broken on . I am guessing it should be pointing to .  
**@mark.needham:** thanks - will fix  
 **@bagi.priyank:** Also, the example uses `DATETIME_CONVERT` instead of
`DATETIMECONVERT`  

###  _#flink-pinot-connector_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#minion-improvements_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  

###  _#fix_llc_segment_upload_

  
 **@elon.azoulay:** @elon.azoulay has joined the channel  
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pinot.apache.org
For additional commands, e-mail: dev-help@pinot.apache.org