Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/06/08 02:00:22 UTC

Apache Pinot Daily Email Digest (2021-06-07)

### _#general_

  
 **@kaushikf9t:** @kaushikf9t has joined the channel  
 **@ragiroux:** @ragiroux has joined the channel  
 **@ramabaratam:** @ramabaratam has joined the channel  
 **@karinwolok1:** To all the Pinot power players here! Wanted to make you
aware of an awesome event coming up next week:  @jackie.jxt @steotia  
 **@sheetalarun.kadam2:** @sheetalarun.kadam2 has joined the channel  

###  _#random_

  
 **@kaushikf9t:** @kaushikf9t has joined the channel  
 **@ragiroux:** @ragiroux has joined the channel  
 **@ramabaratam:** @ramabaratam has joined the channel  
 **@sheetalarun.kadam2:** @sheetalarun.kadam2 has joined the channel  

###  _#feat-presto-connector_

  
 **@ming.liu:** @ming.liu has joined the channel  

###  _#troubleshooting_

  
 **@kaushikf9t:** @kaushikf9t has joined the channel  
 **@shaileshjha061:** Hi @mayanks @dlavoie @npawar, I have a few questions on
Pinot's restore mechanism. *Can offline tables be converted back to realtime*
after pushing the backed-up segments from the offline table? And can the
backup of a realtime table that is stored in Pinot be used to create an
offline table? In other words, *is it possible by any means to restore the
realtime table data if the GCP service account is changed and the VPC is
changed?* Kindly help me figure out restoration for realtime tables. Thanks in
advance. *CC: @nadeemsadim* *@mohamed.sultan* *@mags.carlin*  
**@patidar.rahul8392:** @shaileshjha061 For this use case you can try a hybrid
table. It's a combination of a realtime and an offline table, and then you can
push the segments into the hybrid table.  
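
For context, a hybrid table in Pinot is not a separate table type: it is
simply a REALTIME and an OFFLINE table config that share the same table name.
A minimal sketch, assuming a controller at localhost:9000 and hypothetical
config files that both set `"tableName": "myTable"`:

```
# Create the OFFLINE and REALTIME halves of the hybrid table; both config
# files (hypothetical names) must use the same "tableName": "myTable".
curl -X POST -H "Content-Type: application/json" \
  -d @myTable-offline.json "http://localhost:9000/tables"
curl -X POST -H "Content-Type: application/json" \
  -d @myTable-realtime.json "http://localhost:9000/tables"
```

Queries against `myTable` are then served from both halves, with a time
boundary deciding which rows come from the offline side and which from the
realtime side.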
**@shaileshjha061:** @patidar.rahul8392 Can you suggest or share a docs link
for pushing the segments back into the offline table of a hybrid table? CC:
@nadeemsadim  
**@patidar.rahul8392:**  
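
A rough sketch of a segment push using the `pinot-admin.sh` tool shipped with
the Pinot distribution, assuming the controller runs at localhost:9000 and the
backed-up segment tarballs sit in a hypothetical `./backup` directory (check
`UploadSegment -help` for the exact option names in your version):

```
# Upload every segment .tar.gz found in ./backup to the offline table
# via the controller's segment upload endpoint.
bin/pinot-admin.sh UploadSegment \
  -controllerHost localhost \
  -controllerPort 9000 \
  -segmentDir ./backup
```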
 **@kaushikf9t:** Just started exploring Apache Pinot by setting it up on AWS
using the documentation here -  ("This guide provides a quick start for
running Pinot on Amazon Web Services (AWS)"). After creating an EKS cluster, I
installed Pinot on top of it using Helm, following this document -  ("Pinot
quick start in Kubernetes")  
 **@kaushikf9t:** When I do a `kubectl get all -n pinot-quickstart`, I see
this has brought up classic load balancers to expose both the broker and the
controller on TCP ports. When I make a curl/browser request to the DNS names,
I expect these to show the UI for the broker and the Swagger UI for the
controller, but the request eventually times out without bringing up the UI. I
am a beginner in AWS networking as well, but the security groups created by
these setup instructions, which I have followed exactly, allow TCP requests
from 0.0.0.0/0. Any inputs on bringing up the UI for the broker and controller
are much appreciated!  
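
While debugging the load balancer and security groups, `kubectl port-forward`
can bypass the AWS networking path entirely; a minimal sketch, assuming the
service names and default ports the quickstart Helm chart usually creates
(verify them with the first command):

```
# See which services exist and which ports they expose.
kubectl get svc -n pinot-quickstart

# Tunnel straight to the pods, bypassing the classic load balancers.
kubectl port-forward service/pinot-controller 9000:9000 -n pinot-quickstart &
kubectl port-forward service/pinot-broker 8099:8099 -n pinot-quickstart &

# Controller UI / Swagger: http://localhost:9000
# Broker REST endpoint:    http://localhost:8099
```

If the UIs come up over the port-forward but not through the load balancer,
the problem is in the AWS networking layer rather than in Pinot itself.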
 **@ragiroux:** @ragiroux has joined the channel  
 **@joshhighley:** I have a realtime table consuming messages from a
3-partition Kafka topic. Possibly due to some network issues over the weekend,
all 3 consumers are repeating the same error messages about a bad offset:
```
2021/06/07 15:24:27.918 INFO [Fetcher] [agent_daily__2__2__20210605T0819Z] [Consumer clientId=consumer-71, groupId=] Fetch offset 22 is out of range for partition agent_daily-2, resetting offset
2021/06/07 15:24:27.919 INFO [Fetcher] [agent_daily__2__2__20210605T0819Z] [Consumer clientId=consumer-71, groupId=] Resetting offset for partition agent_daily-2 to offset 5.
2021/06/07 15:24:27.938 INFO [Fetcher] [agent_daily__1__2__20210605T0819Z] [Consumer clientId=consumer-73, groupId=] Fetch offset 20 is out of range for partition agent_daily-1, resetting offset
2021/06/07 15:24:27.939 INFO [Fetcher] [agent_daily__1__2__20210605T0819Z] [Consumer clientId=consumer-73, groupId=] Resetting offset for partition agent_daily-1 to offset 0.
2021/06/07 15:24:27.942 INFO [Fetcher] [agent_daily__0__2__20210605T0819Z] [Consumer clientId=consumer-72, groupId=] Fetch offset 24 is out of range for partition agent_daily-0, resetting offset
2021/06/07 15:24:27.943 INFO [Fetcher] [agent_daily__0__2__20210605T0819Z] [Consumer clientId=consumer-72, groupId=] Resetting offset for partition agent_daily-0 to offset 1.
2021/06/07 15:24:33.018 INFO [Fetcher] [agent_daily__2__2__20210605T0819Z] [Consumer clientId=consumer-71, groupId=] Fetch offset 22 is out of range for partition agent_daily-2, resetting offset
2021/06/07 15:24:33.018 INFO [Fetcher] [agent_daily__2__2__20210605T0819Z] [Consumer clientId=consumer-71, groupId=] Resetting offset for partition agent_daily-2 to offset 5.
2021/06/07 15:24:33.038 INFO [Fetcher] [agent_daily__1__2__20210605T0819Z] [Consumer clientId=consumer-73, groupId=] Fetch offset 20 is out of range for partition agent_daily-1, resetting offset
2021/06/07 15:24:33.039 INFO [Fetcher] [agent_daily__1__2__20210605T0819Z] [Consumer clientId=consumer-73, groupId=] Resetting offset for partition agent_daily-1 to offset 0.
2021/06/07 15:24:33.042 INFO [Fetcher] [agent_daily__0__2__20210605T0819Z] [Consumer clientId=consumer-72, groupId=] Fetch offset 24 is out of range for partition agent_daily-0, resetting offset
2021/06/07 15:24:33.043 INFO [Fetcher] [agent_daily__0__2__20210605T0819Z] [Consumer clientId=consumer-72, groupId=] Resetting offset for partition agent_daily-0 to offset 1.
```
The 'reset' offsets of 5, 0, and 1 are correct: I created a new 'test' table
for the same topic and it used those offsets with no issue. I've tried
disabling/enabling the table but it resumes those error messages. Is there
some other way to reset the table consumers?  
**@mayanks:** Sounds like an issue on the Kafka side might have made Pinot's
offset state inconsistent?  
**@mayanks:** If so, then the only way I can think of is to delete and
recreate the table in Pinot.  
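
A minimal sketch of the delete-and-recreate via the controller REST API,
assuming a controller at localhost:9000; the table name comes from the logs
above, and the config file name is hypothetical:

```
# Drop the realtime table (this deletes its segments and the saved offset
# state), then re-create it from the original table config so consumption
# starts fresh from the offsets Kafka actually has.
curl -X DELETE "http://localhost:9000/tables/agent_daily?type=realtime"
curl -X POST -H "Content-Type: application/json" \
  -d @agent_daily-realtime.json "http://localhost:9000/tables"
```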
**@joshhighley:** but I can create a new table against the same topic, and not
have the issue. Is there any other way to reset Pinot's offset state? Luckily
this is a dev environment -- deleting and re-creating the table isn't really
an option in production  
**@joshhighley:**...Pinot seems to know what the offset should be, since it
logs it. It doesn't seem to use that offset, though  
**@mayanks:** If the offsets are corrupted on the Kafka side, then it is a
Kafka-side issue that needs to be prevented in production?  
**@mayanks:** Would be good to understand the root cause here first, if we
want to avoid it in production.  
**@mayanks:** If it turns out to be a pinot side issue, then yes definitely we
should have a way to recover gracefully.  
**@mayanks:** BTW, the log message is from the Kafka consumer.  
**@joshhighley:** I can create a new table subscribed to the same topic and
receive messages while the existing table still does not. If there was a Kafka
issue at some point then it appears corrected  
**@mayanks:** What I am saying is that Pinot stores offsets as it commits
segments. If a disruptive change on kafka side makes these offsets
inconsistent with what Pinot had seen earlier, then the kafka consumer inside
of Pinot will run into this state.  
**@mayanks:** When you start a new table, the new table is seeing all
consistent offsets (as it doesn't have anything saved).  
**@mayanks:** Can you try to restart the Pinot servers and see if that helps?  
**@mayanks:** My suspicion is that after Pinot committed segments and saved
the offsets for them, there was a disruptive change in kafka side due to which
these saved offsets are out-of-sync with Kafka.  
**@ssubrama:** See what the segment metadata says regarding the offset where
the segment is supposed to start consuming; Pinot uses that offset. Also, what
is your Kafka retention time? Maybe increasing it a little will solve your
problem.  
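
A rough sketch of both checks, assuming a controller at localhost:9000 and
access to the Kafka CLI tools (the bootstrap server address is a placeholder);
table, segment, and topic names are taken from the logs above:

```
# Inspect the segment metadata, which records the offset the segment is
# supposed to start consuming from.
curl "http://localhost:9000/segments/agent_daily/agent_daily__2__2__20210605T0819Z/metadata"

# Check the topic's current retention settings...
kafka-configs.sh --bootstrap-server kafka:9092 --describe \
  --entity-type topics --entity-name agent_daily

# ...and, if needed, raise retention (here: 7 days in milliseconds).
kafka-configs.sh --bootstrap-server kafka:9092 --alter \
  --entity-type topics --entity-name agent_daily \
  --add-config retention.ms=604800000
```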
 **@ramabaratam:** @ramabaratam has joined the channel  
 **@nadeemsadim:** Hi.. any docs/videos for pushing the tar backup on GCS to a
hybrid/offline table?  
**@mayanks:** Have you looked at:  
**@sheetalarun.kadam2:** @sheetalarun.kadam2 has joined the channel  

###  _#pinot-dev_

  
 **@ming.liu:** @ming.liu has joined the channel  

###  _#getting-started_

  
 **@kaushikf9t:** @kaushikf9t has joined the channel  
 **@kaushikf9t:** Just started exploring Apache Pinot by setting it up on AWS
using the documentation here -  After creating an EKS cluster, I installed
Pinot on top of it using Helm, following this document -  
**@kaushikf9t:** When I do a `kubectl get all -n pinot-quickstart`, I see this
has brought up classic load balancers to expose both the broker and the
controller on TCP ports. When I make a curl/browser request to the DNS names,
I expect these to show the UI for the broker and the Swagger UI for the
controller, but the request eventually times out without bringing up the UI. I
am a beginner in AWS networking as well, but the security groups created by
these setup instructions, which I have followed exactly, allow TCP requests
from 0.0.0.0/0. Any inputs on bringing up the UI for the broker and controller
are much appreciated!  
 **@ming.liu:** @ming.liu has joined the channel  

###  _#pinot-docsrus_

  
 **@mayanks:** @jlli Could you please help with converting the google doc
design doc for segment sorting/partitioning into docs?  
 **@jlli:** @jlli has joined the channel  
 **@nadeemsadim:** Any docs/videos for pushing the tar backup on GCS to a
hybrid/offline table?  