You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2021/09/24 02:00:17 UTC

Apache Pinot Daily Email Digest (2021-09-23)

### _#general_

  
 **@nageshblore:** @nageshblore has joined the channel  
 **@ken:** We have about 1500 segments in our HDFS deep store directory. We
push these to our Pinot cluster via a metadata push, so only the URI is sent
to the controller, which works well. But when we add a single new segment, our
push job still has to download/untar all 1500 segments, because we can’t
specify a pattern to filter the output directory files to only the new file.
We could add per-month subdirectories in HDFS to restrict the number of files
being processed this way, but is there a better approach? Note that the files
in HDFS can’t be moved around, as their deep store URIs are part of the
Zookeeper state.  
**@mayanks:** Perhaps we can support file name / pattern to select?  
**@ken:** That was one thought I had, yes.  
**@ken:** Essentially there’s an implicit pattern currently, as the filename
has to end with “.tar.gz”  
**@mayanks:** Yeah, should be easy to enhance.  
**@g.kishore:** per month and per day sub-directory is not a bad idea.  
 **@gsisodiya:** @gsisodiya has joined the channel  
 **@sheetalarun.kadam2:** I have a everyday load to Pinot tables and I am
planning to create a segment everyday. Problem is on some days I expect to
have very low records so the resulting segment will be too small and will
affect performance in longer run. One solution is to do periodic segment
merges. Any suggestions?  
**@mayanks:** What's your retention? If you have only one segment per day,
then there isn't much number of segments to have any severe performance
degradation.  
**@sheetalarun.kadam2:** Yes its 1 segment per day. But I heard from some
other people that having smaller segments affected the performance. There
might be days (even weeks) when the segment will be in KBs (<10 records)  

###  _#random_

  
 **@nageshblore:** @nageshblore has joined the channel  
 **@gsisodiya:** @gsisodiya has joined the channel  

###  _#troubleshooting_

  
 **@gabuglc:** Hello guys, any ideas why pinot isnt reading all messages from
my kafka topic? According to the logs, it has reached the 3815366 offset ,
which is the amount of messages I have. However, when I query them on pinot, i
got only 3794657 ```pinot-server_1 | Consumed 0 events from (rate:0.0/s),
currentOffset=3815366, numRowsConsumedSoFar=590061,
numRowsIndexedSoFar=577470```  
**@mayanks:** Offsets are not necessarily contiguous, right? If so, how did
you ensure the number of messages?  
**@gabuglc:** im counting them in ksqldb  
**@mayanks:** Ok. Can you check the logs for any messages that were dropped
due to schema mismatch etc  
**@mayanks:** And also is this a realtime only table or a hybrid table?  
**@gabuglc:** realtime  
**@gabuglc:** got it ` ```2021/09/23 02:57:17.146 ERROR
[LLRealtimeSegmentDataManager_crs__0__7__20210923T0256Z]
[crs__0__7__20210923T0256Z] Caught exception while transforming the record: {
"fieldToValueMap" : { "TS_MS" : 1632087338018, "CURRENCY_ID" : 1, "SUCCESS" :
"true", "AMOUNT_AUTHORIZED" : -168.0, "EXCHANGE_RATE" : 3.713, "HA__ID" :
90163974, "SERVICE_MODEL_ID" : 1, "STATE_T_T" : 1616475600000, "V_CRS__ID" :
24, "PR__ID" : 1, "HA_T" : "2021-03-23T15:54:31.756192Z", "STATE_T" :
"2021-03-23T05:00:00Z", "V_LOG_CR__ID" : 21828011, "ST__ID" : 2820,
"PAYMENT_NET" : -168.0, "AMOUNT" : -168.0, "PG_CLASS" : 1, "AUTO_APTURE" :
null, "HA_T_T" : null }, "nullValueFields" : [ ] }
java.lang.IllegalStateException: Caught exception while invoking method:
public static long
org.apache.pinot.common.function.scalar.DateTimeFunctions.fromDateTime(java.lang.String,java.lang.String)
with arguments: [2021-03-23T15:54:31.756192Z, yyyy-MM-dd'T'HH:mm:ssZ] at
org.apache.pinot.common.function.FunctionInvoker.invoke(FunctionInvoker.java:131)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:120)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator.evaluate(InbuiltFunctionEvaluator.java:87)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.segment.local.recordtransformer.ExpressionTransformer.transform(ExpressionTransformer.java:96)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.segment.local.recordtransformer.CompositeTransformer.transform(CompositeTransformer.java:83)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:514)
[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:417)
[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:564)
[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
java.lang.Thread.run(Thread.java:829) [?:?] Caused by:
java.lang.reflect.InvocationTargetException at
jdk.internal.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at
org.apache.pinot.common.function.FunctionInvoker.invoke(FunctionInvoker.java:128)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] ...
8 more Caused by: java.lang.IllegalArgumentException: Invalid format:
"2021-03-23T15:54:31.756192Z" is malformed at ".756192Z" at
org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.common.function.DateTimePatternHandler.parseDateTimeStringToEpochMillis(DateTimePatternHandler.java:37)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
org.apache.pinot.common.function.scalar.DateTimeFunctions.fromDateTime(DateTimeFunctions.java:264)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463] at
jdk.internal.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at
org.apache.pinot.common.function.FunctionInvoker.invoke(FunctionInvoker.java:128)
~[pinot-all-0.9.0-SNAPSHOT-jar-with-
dependencies.jar:0.9.0-SNAPSHOT-11f8550b9b2881ede4d105416ed970a5dd708463]```  
**@gabuglc:** apparently some records are with another time format  
**@mayanks:** Yeah that would explain the missing records  
**@nageshblore:** @nageshblore has joined the channel  
 **@prashant.pandey:** Hi guys, we faced a sudden ingestion rate drop to 0
some time back. These are our observations (have attached the log files).
Realtime-server logs: ```java.lang.InternalError: a fault occurred in a recent
unsafe memory access operation in compiled Java code 15 at
org.apache.pinot.core.realtime.impl.dictionary.BaseOffHeapMutableDictionary.indexValue(BaseOffHeapMutableDictionary.java:361)
14 at
org.apache.pinot.core.realtime.impl.dictionary.StringOffHeapMutableDictionary.index(StringOffHeapMutableDictionary.java:61)
13 at
org.apache.pinot.core.indexsegment.mutable.MutableSegmentImpl.updateDictionary(MutableSegmentImpl.java:521)
12 at
org.apache.pinot.core.indexsegment.mutable.MutableSegmentImpl.index(MutableSegmentImpl.java:466)
11 at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:497)
10 at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:402)
9 at
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:538)
8 at java.lang.Thread.run(Thread.java:748) 7 2021/09/23 15:00:16.067 INFO
[HelixServerStarter] [Start a Pinot [SERVER]] Sleep for 10000ms as service
status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=rawServiceView__0__3342__20210922T1809Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068648,
modifiedTime=163240921557 7, version=11, waitingFor=CurrentStateMatch,
resource=rawServiceView_REALTIME, numResourcesLeft=2, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init;; 6
2021/09/23 15:00:26.077 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Sleep for 10000ms as service status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=backendEntityView__2__23025__20210922T1909Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068635,
modifiedTime=16324091 07080, version=19, waitingFor=CurrentStateMatch,
resource=backendEntityView_REALTIME, numResourcesLeft=2, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init ;; 5
2021/09/23 15:00:36.093 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Sleep for 10000ms as service status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=backendEntityView__2__23025__20210922T1909Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068635,
modifiedTime=16324091 07080, version=19, waitingFor=CurrentStateMatch,
resource=backendEntityView_REALTIME, numResourcesLeft=1, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init ;; 4
2021/09/23 15:00:46.103 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Sleep for 10000ms as service status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=backendEntityView__2__23025__20210922T1909Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068635,
modifiedTime=16324091 07080, version=19, waitingFor=CurrentStateMatch,
resource=backendEntityView_REALTIME, numResourcesLeft=1, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init ;; 3
2021/09/23 15:00:56.113 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Sleep for 10000ms as service status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=backendEntityView__2__23025__20210922T1909Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068635,
modifiedTime=16324092 55125, version=22, waitingFor=CurrentStateMatch,
resource=backendEntityView_REALTIME, numResourcesLeft=1, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init ;; 2
2021/09/23 15:01:06.122 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Sleep for 10000ms as service status has not turned GOOD:
PinotServiceManagerStatusCallback:Started;MultipleCallbackServiceSta
tusCallback:IdealStateAndCurrentStateMatchServiceStatusCallback:partition=backendEntityView__2__23040__20210923T0619Z,
expected=ONLINE, found=OFFLINE, creationTime=1632409068635,
modifiedTime=16324092 66062, version=48, waitingFor=CurrentStateMatch,
resource=backendEntityView_REALTIME, numResourcesLeft=1, numTotalResources=5,
minStartCount=5,;IdealStateAndExternalViewMatchServiceStatusCallback:Init ;; 1
2021/09/23 15:01:16.215 INFO [HelixServerStarter] [Start a Pinot [SERVER]]
Service status is GOOD after 190391ms 0 2021/09/23 15:01:16.222 INFO
[HelixServerStarter] [Start a Pinot [SERVER]] Pinot server ready 1 2021/09/23
15:30:06.057 WARN [ZkBaseDataAccessor] [ZkClient-EventThread-23-apache-pinot-
zookeeper-bitnami-headless.pinot.svc.cluster.local:2181] Fail to read record
for paths: {/pinot-prod/INSTANCES/S erver_server-realtime-0.server-realtime-
headless.pinot.svc.cluster.local_8098/MESSAGES/74dc1531-5db0-47a4-b97c-95cabc4067dc=-101}```  
**@prashant.pandey:** These are the GC and other metrics. The ingestion
suddenly dropped to 0. You can also see the CPU going to 0 on all real-time
servers in that time window.  
**@prashant.pandey:**  
**@prashant.pandey:** The ingestion rate went back to normal once we restarted
the realtime server pods.  
**@prashant.pandey:** Two of the real-time server pods had restarted 14 and 16
times respectively. We verified that there was no resource crunch for the
servers. There was also no ingestion rate spike at the time of the issue,
although the rate had increased a bit before the incident.  
**@prashant.pandey:** A similar issue happened around 6 days back:  
**@g.kishore:** thanks for sharing the logs  
**@g.kishore:** can you please create an issue and attach the logs  
**@mayanks:** Is there an hs_error.log, or do you see OOM anywhere?  
**@prashant.pandey:** There was no OOM definitely. Let me look for
hs_error.log though.  
**@prashant.pandey:** I’ll create an issue with all of this info.  
**@prashant.pandey:** Issue:  
**@mayanks:** If you are using Pinot 0.8.0, also run the debug endpoint api -  
**@subin.tp:** This is happening again on our system Attaching latest logs
here.  
 **@gsisodiya:** @gsisodiya has joined the channel  

###  _#pinot-docsrus_

  
 **@tisantos:** @tisantos has joined the channel  
\--------------------------------------------------------------------- To
unsubscribe, e-mail: dev-unsubscribe@pinot.apache.org For additional commands,
e-mail: dev-help@pinot.apache.org