Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2022/05/13 02:00:30 UTC

Apache Pinot Daily Email Digest (2022-05-12)

### _#general_

  
 **@saumya2700:** Do we have any option to save the Pinot ingestion time in the table, so that we know if there is any latency in the table while ingesting data from Kafka?  
**@mayanks:** Not atm; note that replicas ingest independently of each other. cc: @npawar @navi.trinity  
**@npawar:** you could add a transformFunction `now()` on a new column. But
with multiple replicas, there’s no guarantee about data consistency across
replicas. This feature would be useful for you once it is ready:  
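For reference, a minimal sketch of that `now()` approach, assuming a new LONG column is first added to the schema (the column name `ingestionTimeMs` here is hypothetical):
```
"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "ingestionTimeMs",
      "transformFunction": "now()"
    }
  ]
}
```
With this in place each record gets stamped with the server's ingestion time, though as noted above the value can differ across replicas.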
**@akashsaluja:** @akashsaluja has joined the channel  
 **@octchristmas:** I have created a batch pipeline that stores data files from a Cloudera Impala Parquet table into a Pinot cluster. How do I gracefully swap segments if the number of input files gets smaller? Like this: `segment.name.prefix : normalizedDate` `exclude.sequence.id : false`
-- input data file ``` ```
-- pinot segment ``` ```
------------------------------
If I redo the batch and the data file is reduced to two:
-- input data file ``` ```
`segment.name: fixed`
If I have to use the `segment.name: fixed` setting, how can I gracefully delete the segment 'batch_2022-05-12_2022-05-12_2'?  
**@ken:** It’s not atomic, but you could push the two new segments, and delete
the third segment using the UI or REST API. Or for a real hack, create a
segment with no data that has the same name as the third segment.  
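A hedged sketch of that REST cleanup, assuming a controller at localhost:9000 and an OFFLINE table named `myTable` (both placeholders); the segment name is the one from the question:
```
curl -X DELETE "http://localhost:9000/segments/myTable_OFFLINE/batch_2022-05-12_2022-05-12_2"
```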
**@npawar:** if you can generate unique names every time (probably with a prefix), you may be able to use the startSegmentReplace/endSegmentReplace constructs. @snlee is this a case where we can use that?  
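For reference, that replacement flow is exposed through controller endpoints along these lines; the exact paths and parameters below are an assumption and worth confirming against the controller's swagger docs (host, table, and segment names are placeholders):
```
# declare the swap: which segments are being replaced by which new ones
curl -X POST "http://localhost:9000/segments/myTable/startReplaceSegments?type=OFFLINE&forceCleanup=true" \
  -H "Content-Type: application/json" \
  -d '{"segmentsFrom": ["batch_2022-05-12_2022-05-12_2"], "segmentsTo": ["myTable_2022-05-12_0"]}'

# after pushing the new segments, commit the swap using the lineage entry id returned above
curl -X POST "http://localhost:9000/segments/myTable/endReplaceSegments?type=OFFLINE&segmentLineageEntryId=<id>"
```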
 **@dnanassy:** @dnanassy has joined the channel  
 **@filipdolinski:** @filipdolinski has joined the channel  
 **@tanmay.movva:** Hello! Is there a php client available for Pinot?  
**@mitchellh:** Hi, there is no native PHP client. However, the Apache Pinot endpoints are HTTP based. You can use the swagger docs link in the UI to see all of the RESTful endpoints.  
**@tanmay.movva:** Got it. We are doing the same now. Just wanted to check if
there is a client available. Thanks! @mitchellh  
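For reference, since the endpoints are plain HTTP, any client (PHP's curl bindings included) can query the broker directly; a minimal sketch, assuming a broker at localhost:8099 and a table named `myTable` (both placeholders):
```
curl -X POST "http://localhost:8099/query/sql" \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT COUNT(*) FROM myTable"}'
```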
 **@g.kishore:** If you are around and would love to know how Cisco Webex is powering end-user analytics using Pinot, you can join  
**@harshininair14:** @harshininair14 has joined the channel  

###  _#random_

  
 **@akashsaluja:** @akashsaluja has joined the channel  
 **@dnanassy:** @dnanassy has joined the channel  
 **@filipdolinski:** @filipdolinski has joined the channel  
 **@tonya:** Starting in 20 minutes! :computer:  
**@harshininair14:** @harshininair14 has joined the channel  

###  _#troubleshooting_

  
 **@saumya2700:** We are occasionally facing a latency issue in Pinot, so I tried to check in Grafana, and Grafana is showing very high latency, but that is not actually the case. Why is the graph showing that much latency? It gives a wrong impression. Is there anything we need to change in the Table Consuming Latency graph? I added the graphs exactly as described in the monitoring link in the Pinot docs:  
**@saurabhd336:** Can you share the promql query you're plotting?  
**@saumya2700:** *avg by (table)
(pinot_server_freshnessLagMs_50thPercentile{namespace="pinot-qa"})* and same
for other percentiles  
 **@akashsaluja:** @akashsaluja has joined the channel  
 **@dnanassy:** @dnanassy has joined the channel  
 **@luys8611:** I'm struggling to add segments to the pinot-controller using this command: `docker exec -it manual-pinot-controller bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /data/docker-job-spec.yml -exec` But it gives an error: ```2022/05/12 09:51:12.633 ERROR [PinotAdministrator] [main] Exception caught: picocli.CommandLine$UnmatchedArgumentException: Unknown option: '-exec'``` Can anyone help me with this?  
**@xiangfu0:** remove `-exec`?  
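That is, the same invocation with the unsupported flag dropped:
```
docker exec -it manual-pinot-controller bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile /data/docker-job-spec.yml
```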
 **@saumya2700:** hi, I am struggling to update a table config. I updated the schema to add a new column and wrote a transformationFunction for that same column. The existing table shows the new column, but the value is not being copied into it. With the same table config I created a new table and everything works fine. I have also reloaded the segments, but it is still not working with the existing table. The record's JSON string is: ```{"header": {"nnTransId": "9003", "qid": 1, "timestamp": 1234567890123}, "status": "N200_SUCCESS"}``` and the table config has: ```"ingestionConfig": { "transformConfigs": [ { "columnName": "header_js", "transformFunction": "jsonFormat(header)" }, { "columnName": "header_nnTransId", "transformFunction": "JSONPATHSTRING(header, '$.nnTransId')" } .....```  
**@mayanks:** Did you not do both at the same time? One issue I can see is that adding a transform function on an existing column is considered backward incompatible and is not allowed. If you created the column in the schema first and are later trying to add the transform function, you might run into this issue.  
**@saumya2700:** Both at the same time, meaning: I first created the column in the schema today and just after that updated the table config. Without adding the column in the schema, it won't allow updating the column in the table config.  
**@mayanks:** Are you able to update the table config with transform, and
confirm that it is accepted?  
**@mayanks:** You can check by querying table config to see if your transform
function shows up  
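A quick way to do that check from the command line, assuming the controller is at localhost:9000 and the table is named `myTable` (placeholders):
```
# the returned JSON contains the OFFLINE/REALTIME table configs; look for the transformConfigs entries
curl -s "http://localhost:9000/tables/myTable" | grep -A 3 transformConfigs
```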
**@saumya2700:** yes it says updated successfully  
**@saumya2700:** I can see the new column in the table when querying, and queries work, but the value in that column is always null; no errors in the logs. With the same config, when I created a new table, the value comes up, so there is no issue with the transformation function either.  
**@mayanks:** Hmm that doesn’t make sense. If table config shows transform
function when you check in Pinot UI, then it should be applied  
**@kharekartik:** Hi, so were there any records in the table before the schema and table config were updated? If yes, can you check whether the query result contains older records or new records?  
**@filipdolinski:** @filipdolinski has joined the channel  
 **@luisfernandez:** hello my friends, it's me again. Does anyone know what would be the reason for ZooKeeper crashing while we are ingesting data with the job YAML? We are running some migrations and it seems like ZooKeeper just keeps being sad and crashing a lot. Also, what's your recommended sizing for ZooKeeper? We are just using the default in the Helm chart, so we may be hitting some ceiling.  
**@mayanks:** Too many segments (tens of thousands or more)?  
**@luisfernandez:** i see 2k segments after the migration was done, but in the end it's definitely more than that  
**@luisfernandez:** how can i tell?  
**@mayanks:** Pinot UI should show  
**@mayanks:** 2k is small  
**@luisfernandez:** the thing is that i don’t know if all the segments that
were supposed to be migrated were migrated  
**@luisfernandez:** we have a 512mb ZK with a 256mb heap  
**@luisfernandez:** 3 replicas  
**@mayanks:** @dlavoie any comments  
**@dlavoie:** What is the crash error?  
**@luisfernandez:** i’m trying to find that out :smile:  
**@luisfernandez:**  
**@dlavoie:** `kubectl logs pinot-zookeeper-1 --previous`  
**@luisfernandez:**
```
2022-05-12 10:15:32,603 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:32,604 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:32,614 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:32,615 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:32,816 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 400
2022-05-12 10:15:32,816 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:32,817 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:32,817 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:33,217 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 800
2022-05-12 10:15:33,221 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:33,221 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:33,221 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:34,022 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 1600
2022-05-12 10:15:34,023 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:34,023 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:34,023 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:35,624 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 3200
2022-05-12 10:15:35,624 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:35,624 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:35,625 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:38,825 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 6400
2022-05-12 10:15:38,826 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:38,826 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:38,826 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:45,227 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 12800
2022-05-12 10:15:45,228 [myid:2] - INFO [WorkerSender[myid=2]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 2)
2022-05-12 10:15:45,228 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:45,233 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x110001d0cf (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x12 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-05-12 10:15:56,491 [myid:2] - INFO [NIOWorkerThread-1:FourLetterCommands@234] - The list of known four letter word commands is : [{1936881266=srvr, 1937006964=stat, 2003003491=wchc, 1685417328=dump, 1668445044=crst, 1936880500=srst, 1701738089=envi, 1668247142=conf, -720899=telnet close, 2003003507=wchs, 2003003504=wchp, 1684632179=dirs, 1668247155=cons, 1835955314=mntr, 1769173615=isro, 1920298859=ruok, 1735683435=gtmk, 1937010027=stmk}]
2022-05-12 10:15:56,491 [myid:2] - INFO [NIOWorkerThread-1:FourLetterCommands@235] - The list of enabled four letter word commands is : [[wchs, stat, wchp, dirs, stmk, conf, ruok, mntr, srvr, wchc, envi, srst, isro, dump, gtmk, telnet close, crst, cons]]
2022-05-12 10:15:56,491 [myid:2] - INFO [NIOWorkerThread-1:NIOServerCnxn@518] - Processing ruok command from /127.0.0.1:56218
2022-05-12 10:15:57,298 [myid:2] - INFO [NIOWorkerThread-2:NIOServerCnxn@518] - Processing srvr command from /127.0.0.1:56224
```
**@dlavoie:** I see no shutdown command.  
**@dlavoie:** `kubectl describe pod pinot-zookeeper-1` Should give an exit
status for previous container  
**@luisfernandez:** ``` Last State: Terminated Reason: Error Exit Code: 143```  
**@dlavoie:** We get this error when the liveness probe stops returning exit
code 0  
**@mayanks:** Perhaps too low heap is a potential issue here?  
**@dlavoie:** For 2K segments, that's a reasonable assumption  
**@dlavoie:** But I would expect some OOM Killed or JVM OOM stacktrace  
**@luisfernandez:** ops people did tell me that those pods seem to be using most of the memory available  
**@dlavoie:** Do you have any trace of a OOMKilled in the pod describe output?  
**@luisfernandez:** like w this one? `kubectl describe pod pinot-zookeeper-1`  
**@dlavoie:** yes  
**@dlavoie:** `kubectl describe pod pinot-zookeeper-1 | grep OOM`  
**@luisfernandez:** nada  
**@dlavoie:** Ok, then maybe ZK gracefully reports error status on mntr
command when memory is running out. So, I would recommend bumping memory  
**@luisfernandez:** this is our config  
**@luisfernandez:** ```resources: { requests: { cpu: '500m', memory: '1Gi', },
limits: { cpu: '500m', memory: '1Gi', }, },```  
**@luisfernandez:** ```ZK_HEAP_SIZE: '256M',```  
**@dlavoie:** I would 2x or 4x all of these values  
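For illustration, a 2x bump of the values shown just above might look like this (same keys as that snippet; treat the numbers as a starting point and size from measured usage):
```
resources: {
  requests: { cpu: '1000m', memory: '2Gi' },
  limits: { cpu: '1000m', memory: '2Gi' },
},
// keep the JVM heap comfortably below the container memory limit
ZK_HEAP_SIZE: '512M',
```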
**@luisfernandez:** AHA  
**@luisfernandez:**
```
2022-05-06 15:55:56,927 [myid:3] - WARN [LearnerHandler-/10.24.10.144:59094:LearnerHandler@928] - Ignoring unexpected exception
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
    at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
    at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:926)
    at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:647)
2022-05-06 15:55:56,927 [myid:3] - WARN [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperThread@55] - Exception occurred from thread QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled)
java.lang.OutOfMemoryError: Java heap space
2022-05-06 15:55:56,927 [myid:3] - WARN [LearnerHandler-/10.24.7.181:38132:LearnerHandler@928] - Ignoring unexpected exception
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
    at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
    at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:926)
    at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:647)
2022-05-06 15:55:56,928 [myid:3] - INFO [main:QuorumPeerMain@104] - Exiting normally
```
**@luisfernandez:** in `pinot-zookeeper-2`  
**@dlavoie:** there you go :slightly_smiling_face: Smoking gun  
**@luisfernandez:** cpu also you recommend 2x or 4x?  
**@dlavoie:** 2x is going to be fine, but my recommendation would be to
instrument and measure resource usage  
**@dlavoie:** and adjust  
**@luisfernandez:** :pray: will do so  
**@luisfernandez:** now, in terms of the job spec yaml, if zookeeper dies what
are the implications for the job  
**@luisfernandez:** i do see a lot of exceptions in the controller where we
ran the job  
**@dlavoie:** @mayanks ^^  
**@luisfernandez:** and i imagine that zookeeper dying would just put all the controllers in a sad state and then mess up the import  
**@mayanks:** Yes, exceptions are probably effect rather than cause  
**@luisfernandez:** yea, so we have beefed up those machines  
**@luisfernandez:** also another question, is there anything we have to do to
the helm config in order for us to get grafana metrics for zk?  
**@dlavoie:** Yeah, the current Helm chart is broken  
**@luisfernandez:** it seems like that feature is also recent  
**@dlavoie:** the latest ZK embeds a Prometheus exporter  
**@dlavoie:** We implemented it in our Cloud operator. You only need to add a ZK property. The chart needs that default property and the removal of the sidecar containers  
**@dlavoie:** If you are interested in contributing to the project, that’s a
nice one to look at.  
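For reference, a minimal sketch of the ZooKeeper side of that, assuming ZooKeeper 3.6+ with the built-in Prometheus metrics provider (how these properties get injected depends on the chart, so treat this as the target zoo.cfg settings rather than chart values):
```
# zoo.cfg
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
metricsProvider.exportJvmInfo=true
```
Prometheus can then scrape each ZK pod on port 7000 at /metrics.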
 **@alihaydar.atil:** Hello everyone, is it possible to generate fixed segment names with the sequenceId appended to them? I have an input folder with multiple csv files in it and I want to run an ingestion job to import them. I am trying to use the backfill data feature to truncate my table. I want to replace those segments with another data set in the future, and the data doesn't actually have a time column. It seems that segment names play a role in replacing segments; that's why I am asking about a fixed name plus sequenceId. Thanks in advance :pray:  
**@mayanks:** I think setting the table as a REFRESH table will do that  
**@alihaydar.atil:** Thank you @mayanks, it did the trick. Would it be possible to do this on a table with a timeColumn, if I wanted to discard all the old segments and import fresh data?  
**@mayanks:** Hmm, do you want to do it on a regular basis or once in a while?  
**@mayanks:** For an APPEND table, you can still achieve this if your input folders are date partitioned. Then the segment name for a day will be deterministic  
**@alihaydar.atil:** it would be nice to do it on a regular basis. I would
like to keep daily data in my table.  
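Pulling the suggestions in this thread together, a hedged sketch of the job-spec piece for deterministic, date-based segment names (the prefix is a placeholder; keys as used in the batch ingestion job spec, worth confirming for your version):
```
segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: 'myTable'
    exclude.sequence.id: 'false'
```
With date-partitioned input, re-running the job for a given day regenerates the same segment names, so the new push replaces that day's existing segments.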
 **@stuart.millholland:** Hi. I'm trying to set up GCS as the data bucket for our Pinot controller in our GKE dev environments only. I've set things up in the extra.configs in the controller section and I'm getting this error: Local temporary directory is not configured, cannot use remote data directory  
**@stuart.millholland:** Ah, I may have answered my own question. Looks like my extra configs were overwritten  
**@mayanks:** Also for reference:  
**@stuart.millholland:** Thanks!  
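For anyone hitting the same error, a hedged sketch of the controller settings involved (bucket, project, and temp path are placeholders; key names as documented for the GCS plugin, worth double-checking against your Pinot version):
```
controller.data.dir=gs://my-pinot-bucket/pinot-data
controller.local.temp.dir=/tmp/pinot-controller-tmp
pinot.controller.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.controller.storage.factory.gs.projectId=my-gcp-project
pinot.controller.segment.fetcher.protocols=file,http,gs
pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```
The "Local temporary directory is not configured" error points at the missing `controller.local.temp.dir` entry.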
 **@harshininair14:** @harshininair14 has joined the channel  

###  _#pinot-dev_

  
 **@noon:** @noon has joined the channel  

###  _#getting-started_

  
 **@akashsaluja:** @akashsaluja has joined the channel  
 **@dnanassy:** @dnanassy has joined the channel  
 **@filipdolinski:** @filipdolinski has joined the channel  
 **@madhumitamantri:** @madhumitamantri has joined the channel  
 **@harshininair14:** @harshininair14 has joined the channel  

###  _#jobs_

  
 **@madhumitamantri:** @madhumitamantri has joined the channel  

###  _#introductions_

  
 **@akashsaluja:** @akashsaluja has joined the channel  
 **@dnanassy:** @dnanassy has joined the channel  
 **@filipdolinski:** @filipdolinski has joined the channel  
 **@madhumitamantri:** @madhumitamantri has joined the channel  
 **@harshininair14:** @harshininair14 has joined the channel  

###  _#linen_dev_

  
 **@kam:** @xiangfu0 btw if you have the export history I can upload it to
Linen to have the conversations show up  
 **@xiangfu0:** sure, let me retry this first  