Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/07/02 02:00:33 UTC

Apache Pinot Daily Email Digest (2021-07-01)

### _#general_

  
 **@saravana6m:** @saravana6m has joined the channel  
 **@keweishang:** Hi @yupeng, a question about the real-time table’s upsert
feature: would it be possible to use a column other than the event-time column
(configured by `timeColumnName` in the `segmentsConfig` section of the table
config) — e.g. an int column or the Kafka offset — to decide which record is
the last version when the primary key is the same? I’m asking because a Kafka
Streams JOIN produces many records with the same event-time value (in our
case, the `last_update` column), so in Pinot the last version is chosen
randomly amongst these records with the same event time. Thanks  
**@yupeng:** yes, there’s an issue on this  
**@yupeng:** technically it’s doable. could you add your use case to the issue
and upvote it?  
**@keweishang:** Thank you. I’ll add my use case to the issue and upvote it.
Is there any workaround to this at the moment?  
**@yupeng:** if the issue is a tie on the event time, you can also use the
processing time  
**@yupeng:** the kafka broker can tag the event with a timestamp of when it
was received on the broker  
**@yupeng:** if you can use that as the event time  
**@keweishang:** Good point. We’ll see if it’s a legit workaround for us.
Thanks :+1:  
**@yupeng:** sure thing :slightly_smiling_face:  
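
For reference, a minimal sketch of that broker-timestamp workaround on the Kafka side, assuming a recent Kafka CLI and a hypothetical topic name (`orders`); with `LogAppendTime`, the broker overwrites the producer-supplied timestamp with its own receive time, which Pinot can then ingest as the `timeColumnName` column:

```
# Sketch: have the broker stamp each record with its receive time
# (LogAppendTime) instead of the producer-set CreateTime, so joined
# records no longer share an identical event time. Broker address and
# topic name are illustrative; requires a Kafka new enough to support
# --bootstrap-server in kafka-configs.sh.
kafka-configs.sh --bootstrap-server localhost:9092 --alter \
  --entity-type topics --entity-name orders \
  --add-config message.timestamp.type=LogAppendTime
```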
 **@g.kishore:** Hello All, we need your input as we are thinking about the
big features to add to Pinot. We have avoided implementing joins in Pinot and
have always referred folks to Presto/Spark to achieve joins on top of Pinot.
However, we are seeing contributions from Uber on lookup joins and requests
from users for native join support in Pinot. Is this something that will
benefit existing users of Pinot? How do you handle joins?
• :one: We don’t need it since we pre-join the data before pushing it to Pinot
• :two: We use Presto/Trino and we are happy with it
• :three: We would LOVE to see Pinot support JOIN
Please vote  
**@jmeyer:** It's hard to refuse such a feature (from a user's perspective),
but surely if it wasn't done earlier, there must be strong reasons "against"
it - so I wonder what cost this feature would incur (e.g. in terms of other
features being pushed back). Anyway, I get that your question is only about
how people are doing it *right now* :slightly_smiling_face:  
**@g.kishore:** we typically follow this list, which we published in Dec 2020.
The community has doubled since then; maybe we should do another survey to
include the new users of Pinot  
**@jmeyer:** Ah interesting, thanks for sharing  
**@g.kishore:** more than other features, we are concerned about designing
joins in bits and pieces: lookup joins, subqueries, colocated joins, window
functions, equality joins, inner joins only, etc. IMO it's better to design
for the next few years, even if the implementation lands in bits and pieces  
**@jmeyer:** I think I get what you mean - that would also help in having a
more standard & consistent query language rather than related-but-separate
smaller features. But that would make planning and initial development a lot
more challenging, I guess..  
**@ken:** Joins would be great, but as @jmeyer said, “versus what else?“. And
I’d respond with a `1`, since we denormalize via a Flink workflow, but it
would significantly reduce our data footprint if we weren’t replicating lots
of data between rows because of denormalization. Also agreed that there are
lots of possible definitions of what it means to “support joins”.  
**@ken:** And what about my personal favorite, removing the Zookeeper
requirement? :slightly_smiling_face:  
**@g.kishore:** Pinot depends on Helix, which needs ZK. It would be a big
undertaking, and no one has complained about ZK.  
**@g.kishore:** Is your concern about ZK specifically, or about using a
central config store? In other words, do you prefer etcd over ZK, or are you
referring to removing the central metadata store altogether?  
**@ssubrama:** @ken I am also curious to know what your motivation is behind
removing zookeeper requirements. What about zookeeper is the issue that you
see?  
**@ken:** @ssubrama In my experience it’s been hard for ops teams to keep a
stable, performant Zookeeper cluster. E.g. we’re wrestling with an issue now
where if we do a metadata push of 1000 segments, the Zookeeper cluster goes
down (there’s a weird file-permission error when writing to the WAL that shows
up in the ZK logs). It looks like one problem was that the Pinot cluster was
only configured with one of the three ZK servers, but still…  
**@ken:** The odd thing is that over the years I’ve asked ops people I run
into at conferences about ZK, and it seems like 50% say it’s no problem, and
50% hate it with a passion  
**@ssubrama:** thanks for the clarification. A metadata push of 1000 segments
simultaneously is something we have not attempted; it is usually a few
segments at a time. But we have seen Helix's use of ZK become a bottleneck
when there are 1000s of tables (each with 1000s or 10s of 1000s of segments),
especially during server deployment.  
**@ken:** I know that Ververica now ships their platform with an HA mode (only
for k8s) that removes the need for ZK  
**@ssubrama:** Not familiar with Ververica. I will look it up  
**@ken:** Ververica is the main company behind Flink - so it’s their Flink
platform, where HA mode means maintaining state across multiple Job Managers  
**@ken:** See  
**@ken:** Cassandra uses Paxos (and their Gossip protocol, I guess) vs.
Zookeeper  
**@yupeng:** which version of pinot do you use? i added a patch in 0.6 to cap
the throughput of zk activities, to address issues like this at uber  
**@ken:** 0.7.1  
**@ken:** How can I adjust that cap? Seems interesting…  
**@yupeng:**  
**@yupeng:** the default is set to a large number  
**@yupeng:** and at uber, we set a much smaller one  
**@ken:** OK - any suggestions for what to use, for a small (3 node) ZK
cluster with non-SSD drives? This is for our beta cluster. I’m thinking 1K
:slightly_smiling_face:  
**@yupeng:** 1k might be too low, you can try 10k  
**@yupeng:** it’s more like a rate limiter  
**@ken:** I saw the lengthy discussion on your PR. Wondering if as per “BTW,
Helix team does suggest to use throttling to controller the number of
messages. For the number of the threshold, it is up to Pinot team to discuss.”
there are other settings we should also adjust  
**@ken:** E.g. `jute.maxbuffer`, if we’re using a lower value for max messages  
**@yupeng:** we set it to a large number at uber  
**@yupeng:** 40MB  
**@ken:** So which of the configs (controller, broker, server) should get the
updated `pinot.helix.instance.messages.max` setting? Oh, wait, the name got
changed to `pinot.helix.instance.state.maxStateTransitions`. Looks like it’s
only used in the controller  
**@yupeng:** Yes  
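
For reference, a sketch of where the two settings discussed above could land; the property name and the 40MB value come from this thread, while the file path, the 10k cap, and the `JAVA_OPTS` plumbing are illustrative:

```
# Sketch: cap Helix state-transition messages in the controller config
# (10k is the value suggested above) and raise ZooKeeper's jute.maxbuffer
# to 40MB on the client side; the ZK servers may need the same setting.
echo "pinot.helix.instance.state.maxStateTransitions=10000" >> conf/pinot-controller.conf
export JAVA_OPTS="$JAVA_OPTS -Djute.maxbuffer=41943040"  # 40 * 1024 * 1024 bytes
```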
 **@d.chang:** @d.chang has joined the channel  
 **@jaykhatra21:** @jaykhatra21 has joined the channel  
 **@mark.frenette:** @mark.frenette has joined the channel  

###  _#random_

  
 **@saravana6m:** @saravana6m has joined the channel  
 **@d.chang:** @d.chang has joined the channel  
 **@jaykhatra21:** @jaykhatra21 has joined the channel  
 **@mark.frenette:** @mark.frenette has joined the channel  

###  _#troubleshooting_

  
 **@saravana6m:** @saravana6m has joined the channel  
 **@mohamed.sultan:** Hi team, I'm getting this error in broker, controller,
minion, server. Kindly help on this.  
**@luis.muniz:** You're probably using the wrong version of the Java VM in
your PATH. Check the requirements of Pinot and make sure you have that Java
version installed and that `java -version` shows it. (You may have to change
the PATH and JAVA_HOME variables.)  
**@mohamed.sultan:** This is in GKE environment  
**@luis.muniz:** is a good tool to install and switch between java VMs  
**@luis.muniz:** ah  
**@luis.muniz:** can't help you there, sorry  
**@mohamed.sultan:** anyways, this is because of java version mismatch?  
**@mohamed.sultan:** CC: @nadeemsadim  
**@luis.muniz:** it looks like you are using a switch that is not known by
your Java VM - that would be my first hypothesis  
**@mohamed.sultan:** @mayanks??  
**@xiangfu0:** I think you can remove this option from the jvmOps in
`values.yaml` file  
**@xiangfu0:** or just change the image tag to `latest-jdk11`  
**@mohamed.sultan:** ok let me try  
**@mohamed.sultan:** I have tried with image tag, same issue  
**@mohamed.sultan:** I'll try to remove JVMOpts line from values.yaml  
**@mohamed.sultan:** now  
**@mohamed.sultan:** @mags.carlin!  
**@nadeemsadim:** @xiangfu0 we can see the latest pinot image was updated on
docker hub for the latest tag and is causing some trouble *(could not create
jvm.. unrecognized vm option 'PrintGCDateStamps')*.. is it related to the
recent migration to java11 for pinot? or is there some issue in the gke
cluster that is causing this error at jvm start.. please guide  
**@xiangfu0:** the latest image is built with jdk8; I think the helm scripts
are using vm options available in java 11  
**@nadeemsadim:** is there some tag for jdk8, or some release on dockerhub
which is stable for other java versions? we don't see the history of the
pinot images that were pushed to dockerhub earlier, and the image was
replaced.. the older image was working fine in the old gke cluster.. once we
moved to a new gke cluster we started facing the issue, i guess.
@mohamed.sultan please add  
**@nadeemsadim:** "latest image is built in jdk8, I think helm scripts is
using vm options available in java 11" --> ok.. what will be the vm options
for jdk8 @xiangfu0  
**@xiangfu0:** Can you check the git commit history and see the change for
the java 11 upgrade?  
**@nadeemsadim:** the image on dockerhub was updated almost 10 hours ago.. is
that the root cause of the issue we are facing, maybe because the java
version changed?  
**@xiangfu0:** Default is jdk8  
**@xiangfu0:** Jdk11 images have a suffix in the tag  
**@nadeemsadim:** @mohamed.sultan tried with that.. but again the same
issue.. is there some tag for the jdk8 image?  
**@nadeemsadim:** at least one of latest or latest-jdk11 should work  
**@xiangfu0:** Latest image is jdk8  
**@xiangfu0:** Latest-jdk11 is jdk 11  
**@nadeemsadim:** ok, so for latest-jdk11.. all vm options must be removed?  
**@nadeemsadim:** meaning no xmx and xms should be provided, and then tested?  
**@xiangfu0:** No  
**@xiangfu0:** You just change the image tag  
**@xiangfu0:** From latest to latest-jdk11 and try  
**@nadeemsadim:** yeah.. we tried that but it didn't resolve it  
**@xiangfu0:** No need to modify the jvmOpts  
**@nadeemsadim:** ok  
**@xiangfu0:** Oh?  
**@xiangfu0:** What’s the issue  
**@nadeemsadim:** let me recheck with @mohamed.sultan  
**@mohamed.sultan:** I have changed to latest-jdk11 and removed jvmopts line
from values.yaml  
**@mohamed.sultan:** I think there is no issue now  
**@xiangfu0:**  
**@xiangfu0:** This is the change from old jvmOpts for jdk8  
**@xiangfu0:** Just FYI  
**@nadeemsadim:**  
**@nadeemsadim:** i think there is some bug in latest pull  
**@nadeemsadim:** Actually, Kafka works fine with newer versions of Java. I
had the same problem, and found an error in the `kafka/bin/kafka-run-class.sh`
script, where the Java version was incorrectly parsed. This line grabs too
much of the version string:
```
JAVA_MAJOR_VERSION=$($JAVA -version 2>&1 | sed -E -n 's/.* version "([^.-]*).*"/\1/p')
```
This makes the `if [[ "$JAVA_MAJOR_VERSION" -ge "9" ]]` condition fail to
identify the correct Java version, and adds some unsupported GC options.
Changing the line above to this solved my problem:
```
JAVA_MAJOR_VERSION=$($JAVA -version 2>&1 | sed -E -n 's/.* version "([^.-]*).*/\1/p')
```  
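
A quick way to sanity-check that fix against your own JVM; the corrected expression should print just the major version (`11` on JDK 11, `1` on a 1.8 JVM):

```
# Print what the fixed sed expression extracts from `java -version` output.
java -version 2>&1 | sed -E -n 's/.* version "([^.-]*).*/\1/p'
```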
**@nadeemsadim:** can u confirm @xiangfu0  
**@nadeemsadim:** cc: @dlavoie @mayanks @ken @hussain  
**@mohamed.sultan:** CC: @shaileshjha061  
**@xiangfu0:** This is always overridden by helm  
**@xiangfu0:** JvmOpts  
**@xiangfu0:** Check the helm repo  
**@xiangfu0:** This will override the default jvm opts from the admin script  
**@nadeemsadim:** @xiangfu0 @mohamed.sultan  
**@nadeemsadim:** @xiangfu0: with latest-jdk11, pinot is working fine on
gke.. the only concern is that we are not providing any jvm options in helm..
so which file will it pick the default jvm options from, and how do we give
our own jvm options, since it fails if we provide them in helm? we need to
check metrics in prometheus and also increase xmx  
**@xiangfu0:** you can provide the jvm opts  
**@xiangfu0:** just remove the ones it is complaining about  
**@xiangfu0:** it will override the existing jvmOpts  
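
Pulling the thread together, a sketch of the two fixes discussed: switching to the JDK 11 image and overriding jvmOpts with JDK 11-safe flags (`-XX:+PrintGCDateStamps` and friends were removed in JDK 9+, which is what triggers the "unrecognized VM option" error). The value keys follow the Pinot helm chart but should be verified against your chart version; heap sizes and log path are illustrative:

```
# Sketch: move to the JDK 11 image and supply JDK 11-valid JVM options
# (-Xlog:gc* is the unified-logging replacement for the removed PrintGC* flags).
helm upgrade pinot pinot/pinot \
  --set image.tag=latest-jdk11 \
  --set controller.jvmOpts="-Xms256M -Xmx1G -XX:+UseG1GC -Xlog:gc*:file=/opt/pinot/gc-controller.log"
```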
 **@jmeyer:** Hello :slightly_smiling_face: In Pinot's Helm chart, is there
any reason to have `.Values.{controller|broker}.external.enabled` set to
`true` by default? Maybe having it default to `false` would be a safer
alternative for first-time users who don't know the chart well enough yet  
**@dlavoie:** Hello! Chart defaults are intended for quick starts. Hence, a
public service makes it easier to access right after you’ve deployed it. I
agree non-exposed is safer.  
**@jmeyer:** Yeah, I get that some chart defaults are more "quick test"
oriented while others can be more prod oriented. I guess the question now is,
what should we do for Pinot :smile: Maybe have separate values files?  
**@dlavoie:** Helm doesn’t really have a notion of “profiles”. So the best we
can do is document recommended production-ready helm values.  
**@jmeyer:** My helm knowledge is (very) limited but that sounds like a step
in a good direction  
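
Until such documentation exists, one way a first-time user could keep the services unexposed at install time; a sketch assuming the `.Values.{controller|broker}.external.enabled` keys mentioned above, with release name and namespace illustrative:

```
# Sketch: install without public LoadBalancer services for the controller
# and broker; re-enable later once the cluster is locked down.
helm install pinot pinot/pinot -n pinot \
  --set controller.external.enabled=false \
  --set broker.external.enabled=false
```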
 **@jmeyer:** Anyone have an idea of a workaround for that issue?  
**@jmeyer:** Apart from mapping the `ids` to some datetimes, which sounds
pretty dodgy  
 **@d.chang:** @d.chang has joined the channel  
 **@jaykhatra21:** @jaykhatra21 has joined the channel  
 **@jainendra1607tarun:** Hello Team, I am observing a few issues:
1) I changed the replicas for a table from 1 to 3. All the segments created
after the change have 3 replicas, but the pre-existing segments continue to
have only 1 replica.
2) When a server node goes down, its hosted segments are not recreated on
another node to maintain 3 replicas.
3) When I add a new server, no redistribution of segments happens
automatically.
What are the expectations from Pinot in the above cases?  
**@npawar:** you have to run a rebalance  
**@npawar:**  
**@npawar:** for pre-existing segments to get the increased replicas, and
also to redistribute pre-existing segments onto the new servers  
**@npawar:** 2 is also not expected to happen automatically; the replication
factor is expected to cover for that  
**@jainendra1607tarun:** @npawar Thanks for the clarification. Is there any
recommendation for the number of replicas in a 10-machine cluster?  
**@npawar:** no recommendation as such. Practically, i’ve seen production
environments set it to 3  
**@npawar:** it usually depends: if you encounter a node failure, all your
query load will go to the remaining replica servers. So, can your servers
handle, say, 50% more load, or 33% more load, etc.?  
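
For reference, a sketch of the rebalance call @npawar refers to, via the controller REST API; host, port, and table name are illustrative, and a dry run lets you inspect the target assignment before applying it:

```
# Sketch: preview the new segment assignment first, then apply it.
curl -X POST "http://localhost:9000/tables/myTable/rebalance?type=OFFLINE&dryRun=true"
curl -X POST "http://localhost:9000/tables/myTable/rebalance?type=OFFLINE&dryRun=false"
```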
 **@mark.frenette:** @mark.frenette has joined the channel  

###  _#pinot-dev_

  
 **@saravana6m:** @saravana6m has joined the channel  
 **@luis.muniz:** @luis.muniz has joined the channel  
 **@atri.sharma:** Hi folks, need some help  
 **@atri.sharma:** I have added a new parameter in QueryExecutorConfig, which
I wish to set in a new unit test I am writing  
 **@atri.sharma:** Is there a pattern which I can follow to set and unset the
parameter in the test only?  
 **@g.kishore:** use the server conf if you want to control this at service
start/stop; use query options if you want it more dynamic, on a per-query
basis  
 **@atri.sharma:** Is there an example? I want to localise this parameter's
value to the test class only  
 **@g.kishore:** ```QueryExecutorTest```  
**@g.kishore:**
```
// From QueryExecutorTest: load the executor config from a properties file
// and use it to initialize the query executor under test.
PropertiesConfiguration queryExecutorConfig = new PropertiesConfiguration();
queryExecutorConfig.setDelimiterParsingDisabled(false);
queryExecutorConfig.load(new File(resourceUrl.getFile()));
_queryExecutor = new ServerQueryExecutorV1Impl();
_queryExecutor.init(new PinotConfiguration(queryExecutorConfig), instanceDataManager, _serverMetrics);
```  
**@atri.sharma:** Thank you!  
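
If it helps keep the parameter scoped to one test, a sketch of running just that class from the command line; the `pinot-core` module name is an assumption about where the test lives:

```
# Sketch: run only QueryExecutorTest so the custom QueryExecutorConfig
# parameter never leaks into other suites. Adjust -pl to the right module.
mvn test -pl pinot-core -Dtest=QueryExecutorTest
```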

###  _#getting-started_

  
 **@jaykhatra21:** @jaykhatra21 has joined the channel  
 **@jaykhatra21:** Hi, newbie here! Is there documentation on how to set up
debug configurations (in IntelliJ) for a local deployment of Pinot?  
 **@g.kishore:**  
 **@jaykhatra21:** thanks  
 **@g.kishore:** added instructions to quickstart Pinot from the IDE as well  
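
For attaching a debugger to a quickstart launched outside the IDE, a rough sketch assuming the launcher script honors `JAVA_OPTS`; the port and script name are illustrative - in IntelliJ, add a "Remote JVM Debug" run configuration pointing at the same port:

```
# Sketch: expose a JDWP debug port so IntelliJ can attach on 5005.
export JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
bin/quick-start-batch.sh
```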

###  _#pinot-trino_

  
 **@xiangfu0:** @xiangfu0 has joined the channel  
 **@elon.azoulay:** @elon.azoulay has joined the channel  
 **@xd:** @xd has joined the channel  
 **@xiangfu0:** Hi @elon.azoulay just wanna check if this bug is fixed:
```
java.lang.NullPointerException: null value in entry: Server_ps-fw-1062.service.consul_8098=null
    at com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32)
    at com.google.common.collect.SingletonImmutableBiMap.<init>(SingletonImmutableBiMap.java:42)
    at com.google.common.collect.ImmutableBiMap.of(ImmutableBiMap.java:71)
    at com.google.common.collect.ImmutableMap.of(ImmutableMap.java:124)
    at com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:459)
    at com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:438)
    at io.prestosql.pinot.PinotSegmentPageSource.queryPinot(PinotSegmentPageSource.java:207)
    at io.prestosql.pinot.PinotSegmentPageSource.fetchPinotData(PinotSegmentPageSource.java:176)
    at io.prestosql.pinot.PinotSegmentPageSource.getNextPage(PinotSegmentPageSource.java:144)
    at io.prestosql.operator.ScanFilterAndProjectOperator$ConnectorPageSourceToPages.process(ScanFilterAndProjectOperator.java:376)
    at io.prestosql.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
    at io.prestosql.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:221)
```  
**@xiangfu0:** seeing this in 350  
 **@elon.azoulay:** What's the query that caused it?  
 **@elon.azoulay:** Is it the mixed case table name issue?  
 **@xiangfu0:** I remember this is due to empty results from the pinot server  
 **@xd:** we are just `select * from table limit 10`  
 **@xiangfu0:** hmm  
 **@xd:** or with simple filter  
 **@xiangfu0:** oh  
 **@xiangfu0:** does your table name have upper case?  
 **@xd:** no  
 **@elon.azoulay:** any mixed case in columns? Latest trino has fixes for that  
 **@xd:** `select * from pinot.default.point_transaction limit 10`  
 **@xd:** no  
 **@elon.azoulay:** is point_transaction realtime, offline, or hybrid?  
 **@xd:** hybrid  
 **@xd:** oh, that’s the reason?  
 **@xd:** `point_transaction_OFFLINE`?  
 **@elon.azoulay:** no, it should work, just trying to get some more context:)  
 **@elon.azoulay:** No that should be fine  
 **@elon.azoulay:** When you do `show tables` does your table come up?  
 **@elon.azoulay:** hybrid tables will combine the results (just like a broker
query)  
 **@xd:** `SHOW TABLES from pinot.default` returns ```point_entry
point_transaction```  
 **@elon.azoulay:** Can you try ```select * from pinot.default."select * from
point_transaction"```  
 **@elon.azoulay:** oh, this looks like an older version of the connector
(from the stack trace)  
 **@elon.azoulay:** Would you be able to try with latest trino?  
 **@elon.azoulay:** There have been a lot of fixes since then...  
 **@xd:** yes, it could be. We are only in 350  
 **@xd:** Let me poke around and also try out  
 **@xd:** `select * from pinot.default."select * from point_transaction"`
works!  
 **@xd:** Means we need to upgrade?  
 **@elon.azoulay:** Yep, if you can  
 **@xd:** Sure, thanks a lot! Appreciate it  
 **@elon.azoulay:** keep us posted, if any issues I'll help  
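
For a low-risk way to verify the fix before upgrading the cluster, something like the following could work; the image and catalog path follow standard Trino conventions, while the properties file and port mapping are illustrative:

```
# Sketch: run the latest Trino with the existing Pinot catalog config and
# re-issue the failing query against it before committing to the upgrade.
docker run -d --name trino-test -p 8080:8080 \
  -v "$PWD/pinot.properties:/etc/trino/catalog/pinot.properties" \
  trinodb/trino:latest
```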