Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/03/10 02:00:17 UTC

Apache Pinot Daily Email Digest (2021-03-09)

### _#general_

  
 **@sellieyoung:** @sellieyoung has joined the channel  
 **@prabhuom:** @prabhuom has joined the channel  
 **@ftisiot:** @ftisiot has joined the channel  
 **@dubin555:** @dubin555 has joined the channel  
 **@tech.jayaram:** @tech.jayaram has joined the channel  
 **@tech.jayaram:** What is Apache Pinot all about?  
**@mayanks:** The docs would be a good starting place to answer that  
**@karinwolok1:** Hey @tech.jayaram! Great to have you here! Curious to learn
how you got here without knowing much about Pinot. :slightly_smiling_face:
There are also intro-level resources on our YouTube, including this:  
**@karinwolok1:** In case you missed it :wine_glass: We hosted a meetup last
week with @tingchen, software engineer in Uber’s Data team and Pinot
contributor, where he reviews Uber's data ecosystem and their use of Apache
Pinot. Recording is now officially live! :tada:  
**@karinwolok1:** :wave: Welcome to the newest :wine_glass: Apache Pinot
community members!! :people_holding_hands: Would love to know how you found
the Pinot community and what you're working on! Please introduce yourself if
you haven't already! :smile: @m.h.dugas @sellieyoung @prabhuom @ftisiot
@tech.jayaram @kundu.abhishek @avinashnayak @girishbhat.m7 @calvin.mwenda
@amommendes @neilteng233 @ustela101 @anumukhe @vsriva @phuchdh @dutta.kinshuk
@vmadhira @ita.pai @zyedmohammedanees @joshhighley @ratish1992 @adilsonbna
@tsjagan @1705ayush @juraj.komericki  
 **@m.h.dugas:** Hi everyone! I'm Michael (he/him). Thanks so much for the
warm welcome! :smile: Excited to learn more.  
 **@ratchetmdt:** @ratchetmdt has joined the channel  

###  _#random_

  
 **@sellieyoung:** @sellieyoung has joined the channel  
 **@prabhuom:** @prabhuom has joined the channel  
 **@ftisiot:** @ftisiot has joined the channel  
 **@dubin555:** @dubin555 has joined the channel  
 **@tech.jayaram:** @tech.jayaram has joined the channel  
 **@ratchetmdt:** @ratchetmdt has joined the channel  

###  _#troubleshooting_

  
 **@sellieyoung:** @sellieyoung has joined the channel  
 **@prabhuom:** @prabhuom has joined the channel  
 **@prabhuom:** Hi everyone, I am trying to ingest data from a
Kerberos-enabled Kafka cluster. Could you please help me with how to pass the
Kerberos principal and keytab to streamConfig?  
**@g.kishore:** It’s very similar to instantiating a Kafka consumer: add the
properties in the stream config section of the table config.  
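
A minimal sketch of what that reply could look like in practice, assuming the Kafka consumer properties are placed in `streamConfigs` unprefixed and forwarded to the Kafka client. The property names (`security.protocol`, `sasl.mechanism`, `sasl.kerberos.service.name`, `sasl.jaas.config`) are standard Kafka client settings; the topic, broker address, principal, and keytab path are hypothetical placeholders, and `kerberos_stream_config` is an illustrative helper, not a Pinot API.

```python
import json


def kerberos_stream_config(base, principal, keytab_path, service_name="kafka"):
    """Return a copy of a streamConfigs dict with Kafka SASL/GSSAPI properties added."""
    # JAAS entry telling the Kafka client to log in from a keytab.
    jaas = (
        "com.sun.security.auth.module.Krb5LoginModule required "
        "useKeyTab=true storeKey=true "
        f'keyTab="{keytab_path}" principal="{principal}";'
    )
    config = dict(base)  # leave the caller's dict untouched
    config.update({
        # Assumption: no TLS on the brokers; use SASL_SSL if TLS is enabled.
        "security.protocol": "SASL_PLAINTEXT",
        "sasl.mechanism": "GSSAPI",
        "sasl.kerberos.service.name": service_name,
        "sasl.jaas.config": jaas,
    })
    return config


# Hypothetical topic and broker values, for illustration only.
base = {
    "streamType": "kafka",
    "stream.kafka.topic.name": "events",
    "stream.kafka.broker.list": "kafka-1.example.com:9092",
}
stream_config = kerberos_stream_config(
    base, "pinot@EXAMPLE.COM", "/etc/security/keytabs/pinot.keytab"
)
print(json.dumps(stream_config, indent=2))
```

The printed JSON would go under `streamConfigs` in the table config. The keytab file must be readable at the same path on every Pinot server that consumes the stream.
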
 **@ftisiot:** @ftisiot has joined the channel  
 **@girishbhat.m7:** I am following the Pinot docs for batch importing from
a GCS bucket and am getting the error below:
```
java.lang.IllegalStateException: PinotFS for scheme: gs has not been initialized
    at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-513702582e620829419a93c322740a7193e941c3]
    at org.apache.pinot.spi.filesystem.PinotFSFactory.create(PinotFSFactory.java:80) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-513702582e620829419a93c322740a7193e941c3]
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.init(SegmentGenerationJobRunner.java:94) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-513702582e620829419a93c322740a71
```
I built the source code from the master branch. Is any additional
configuration required on the classpath?  
**@g.kishore:** How are you running it? Do you have the GCS plugin on your
classpath?  
**@girishbhat.m7:** Yes, I checked the Java_all_opts variable in the script;
it is using the plugin directory from the build.  
**@girishbhat.m7:** I am not facing this issue if I use the precompiled binary.  
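
One thing worth ruling out for this error, alongside the plugin classpath, is a job spec that uses `gs://` URIs without a matching `pinotFSSpecs` entry. The sketch below is a hypothetical pre-flight check, not part of Pinot: it takes a job spec already loaded into a dict and lists the URI schemes that have no filesystem entry. It assumes the local `file` scheme is available by default, and the bucket names are placeholders.

```python
from urllib.parse import urlparse


def missing_fs_schemes(job_spec):
    """Return URI schemes used by the job spec that have no pinotFSSpecs entry."""
    # Assumption: the local "file" scheme needs no explicit entry.
    configured = {"file"} | {fs.get("scheme") for fs in job_spec.get("pinotFSSpecs", [])}
    used = set()
    for key in ("inputDirURI", "outputDirURI"):
        uri = job_spec.get(key)
        if uri:
            # A bare path has no scheme, so treat it as a local file.
            used.add(urlparse(uri).scheme or "file")
    return sorted(used - configured)


# Hypothetical job spec with gs:// URIs but no gs filesystem entry.
job_spec = {
    "inputDirURI": "gs://my-bucket/rawdata/",
    "outputDirURI": "gs://my-bucket/segments/",
    "pinotFSSpecs": [],
}
print(missing_fs_schemes(job_spec))  # ['gs']

# Adding an entry for the scheme clears the report. The class name is the one
# I recall from the Pinot GCS docs; verify it against your build.
job_spec["pinotFSSpecs"].append(
    {"scheme": "gs", "className": "org.apache.pinot.plugin.filesystem.GcsPinotFS"}
)
print(missing_fs_schemes(job_spec))  # []
```

If the spec already has the `gs` entry, as it presumably does here since the precompiled binary works, the difference is more likely in how the source build loads its plugin directory.
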
 **@dubin555:** @dubin555 has joined the channel  
 **@tech.jayaram:** @tech.jayaram has joined the channel  
 **@falexvr:** Guys! I wonder if you have already implemented an aggregation
function that lets us take a column and concatenate its values into a
string. My use case is that I want to know, per minute, which countries saw
our streaming events, but I need it as a flat string instead of several
rows. Is that possible right now? I tried this, but it didn't work:
`CONCAT(DISTINCT(COUNTRY_CODE))` I also tried this:
`groovy('{"returnType":"STRING","isSingleValue":true}',
'arg0.toList().join(",")', DISTINCT(COUNTRY_CODE))` but it didn't work either.  
 **@falexvr:** It seems `DISTINCT` can't be used to pass multiple values into
UDFs. Is there any way to do so? For example, in a grouped query where we have
grouped all events per minute and now want to do something like that.  
 **@g.kishore:** That's right: DISTINCT output represents multiple rows, and
the existing post-aggregation functions can only act on one row.  
 **@g.kishore:** Please file an issue and we can add some UDFs to support
this.  
**@falexvr:** Nice. In the meantime, do you have any docs so we can try to
speed things up a bit by implementing something on our side?  
**@g.kishore:** yes, you can -  
**@g.kishore:** post aggregation function is much simpler  
**@falexvr:** Thanks a lot  
**@g.kishore:** This is another article from @kharekartik  
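
Until such a UDF exists, the flat string can be built on the client. This is a minimal sketch of that workaround, not a Pinot feature: it assumes the query groups by minute and `COUNTRY_CODE` and returns one `(minute, country_code)` row per combination. The function name and sample rows are hypothetical.

```python
from collections import defaultdict


def countries_per_minute(rows, sep=","):
    """Fold (minute, country_code) rows into one de-duplicated string per minute."""
    grouped = defaultdict(set)
    for minute, country in rows:
        grouped[minute].add(country)  # the set drops repeated codes
    # Sort minutes and codes so the output is deterministic.
    return {minute: sep.join(sorted(codes)) for minute, codes in sorted(grouped.items())}


# Hypothetical rows, shaped like the result of a GROUP BY minute, COUNTRY_CODE query.
rows = [
    ("2021-03-09 20:33", "US"),
    ("2021-03-09 20:33", "CO"),
    ("2021-03-09 20:33", "US"),
    ("2021-03-09 20:34", "DE"),
]
result = countries_per_minute(rows)
print(result)
```

The trade-off is that the broker still returns one row per minute and country, so this only changes the shape of the result, not the amount of data transferred.
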
 **@1705ayush:** Hi all, I use the Helm and Kubernetes approach to start Pinot
on minikube. I don't know why my pinot-broker takes so long and needs
multiple restarts to start running and reach the READY state. Is there any
config that I am missing here? For example:
```
$ kubectl -n my-pinot-kube get all
NAME                     READY   STATUS    RESTARTS   AGE
pod/pinot-broker-0       0/1     Running   3          6m52s
pod/pinot-controller-0   1/1     Running   1          6m51s
pod/pinot-server-0       1/1     Running   1          6m51s
pod/pinot-zookeeper-0    0/1     Running   2          6m51s
```
Here is the log from when the pinot-broker crashed and restarted:
```
$ kubectl -n my-pinot-kube logs pinot-broker-0 -f
2021/03/09 20:33:27.866 INFO [HelixBrokerStarter] [Start a Pinot [BROKER]] Starting Pinot broker
2021/03/09 20:33:27.879 INFO [HelixBrokerStarter] [Start a Pinot [BROKER]] Connecting spectator Helix manager
2021/03/09 20:34:02.569 INFO [HelixBrokerStarter] [Start a Pinot [BROKER]] Setting up broker request handler
2021/03/09 20:34:36.312 WARN [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 100012444580000, instance: Broker_pinot-broker-0.pinot-broker-headless.my-pinot-kube.svc.cluster.local_8099, type: SPECTATOR
```
**@dlavoie:** Broker, server and controller will crash and restart until
zookeeper is ready  
 **@dlavoie:** that’s “normal”  
 **@fx19880617:** One thing we may try is the Helm pre-install hook  
**@fx19880617:** to split the installation into ZooKeeper first, then the other components.  
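
The idea of bringing ZooKeeper up first can also be approximated by having each component wait for the ZooKeeper port before starting. The sketch below is a generic TCP wait loop under that assumption, not something the Pinot Helm chart is known to ship; `wait_for_port` is a hypothetical helper. An open port only shows that ZooKeeper is accepting connections, not that it is fully ready to serve.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=120.0, interval_s=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; success means the port is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Covers connection refused, DNS failure, and connect timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

With the service name and port from the broker log above, a startup wrapper could call `wait_for_port("pinot-zookeeper", 2181)` and start the broker only when it returns `True`.
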
 **@ratchetmdt:** @ratchetmdt has joined the channel  