Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2020/09/02 02:00:08 UTC

Apache Pinot Daily Email Digest (2020-09-01)

<h3><u>#general</u></h3><br><strong>@sharmavedang07: </strong>@sharmavedang07 has joined the channel<br><strong>@bobli.usc: </strong>@bobli.usc has joined the channel<br><strong>@karinwolok1: </strong>@karinwolok1 has joined the channel<br><strong>@sriyansh.cse: </strong>@sriyansh.cse has joined the channel<br><h3><u>#random</u></h3><br><strong>@sharmavedang07: </strong>@sharmavedang07 has joined the channel<br><strong>@bobli.usc: </strong>@bobli.usc has joined the channel<br><strong>@karinwolok1: </strong>@karinwolok1 has joined the channel<br><strong>@sriyansh.cse: </strong>@sriyansh.cse has joined the channel<br><h3><u>#sql-rollout-plan</u></h3><br><strong>@tim780: </strong>@tim780 has joined the channel<br><h3><u>#minion-star-tree</u></h3><br><strong>@mailtobuchi: </strong>okay. thanks. I just got this idea and wanted to explore. Seems like it’s possible. We may or may not implement immediately but will definitely ping here if we’re proceeding with impl. appreciate the help<br><strong>@laxman: </strong>@laxman has joined the channel<br><h3><u>#troubleshooting</u></h3><br><strong>@sjeetsingh2801: </strong>@sjeetsingh2801 has joined the channel<br><strong>@yash.agarwal: </strong>Hey Team, I am getting intermittent exceptions in CombinePlanNode.
```Exception processing requestId 137
java.lang.RuntimeException: Caught exception while running CombinePlanNode.
	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:149) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:221) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_265]
	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:1.8.0_265]
	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	... 13 more
Processed requestId=137,table=guestslslitm3years_OFFLINE,segments(queried/processed/matched/consuming)=1058/-1/-1/-1,schedulerWaitMs=0,reqDeserMs=4,totalExecMs=10659,resSerMs=0,totalTimeMs=10663,minConsumingFreshnessMs=-1,broker=Broker_10.59.100.47_8099,numDocsScanned=-1,scanInFilter=-1,scanPostFilter=-1,sched=fcfs
Slow query: request handler processing time: 10663, send response latency: 58, total time to handle request: 10721```
Is there a reason why this is happening? Is there a way we can override the timeout of 10s in CombinePlanNode?<br><strong>@yash.agarwal: </strong>What is the best approach to solve this? I am currently storing about 3 billion rows (1000 segments) per table on a single data node. Should I rebalance it to more servers or add CPU/RAM to the same node?<br><strong>@tim780: </strong>@tim780 has joined the channel<br><strong>@tim780: </strong>hey #troubleshooting, i have followed the getting started guide for running pinot in kubernetes. i was able to configure a table to ingest records from a kafka cluster.<br><strong>@tim780: </strong>i was able to ingest 125000 records<br><strong>@tim780: </strong>when i do a count(*) on the table the count never increases<br><strong>@tim780: </strong>i see the following in the controller log<br><strong>@tim780: </strong>```2020/09/01 22:22:31.308 WARN [LLCSegmentCompletionHandlers] [grizzly-http-server-1] Segment file: file:/var/pinot/controller/data/motion/motion__3__0__20200901T2214Z already exists. Replacing it with segment: motion__3__0__20200901T2214Z from instance: Server_pinot-server-1.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098
2020/09/01 22:22:31.311 WARN [LLCSegmentCompletionHandlers] [grizzly-http-server-0] Segment file: file:/var/pinot/controller/data/motion/motion__6__0__20200901T2214Z already exists. Replacing it with segment: motion__6__0__20200901T2214Z from instance: Server_pinot-server-1.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098
2020/09/01 22:22:31.320 ERROR [SegmentCompletionFSM_motion__3__0__20200901T2214Z] [grizzly-http-server-1] Caught exception while committing segment metadata for segment: motion__3__0__20200901T2214Z
java.lang.NullPointerException: null
        at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.updateCommittingSegmentZKMetadata(PinotLLCRealtimeSegmentManager.java:507) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentMetadataInternal(PinotLLCRealtimeSegmentManager.java:446) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentMetadata(PinotLLCRealtimeSegmentManager.java:416) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.commitSegment(SegmentCompletionManager.java:1091) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.segmentCommitEnd(SegmentCompletionManager.java:656) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager.segmentCommitEnd(SegmentCompletionManager.java:325) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.api.resources.LLCSegmentCompletionHandlers.segmentCommit(LLCSegmentCompletionHandlers.java:330) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at sun.reflect.GeneratedMethodAccessor42.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_265]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_265]
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:391) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:80) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:253) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:232) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:679) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:353) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0^Cfa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentMetadataInternal(PinotLLCRealtimeSegmentManager.java:446) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentMetadata(PinotLLCRealtimeSegmentManager.java:416) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.commitSegment(SegmentCompletionManager.java:1091) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]
        at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.segmentCommitEnd(SegmentCompletionManager.java:656) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-86a01ff6da71e433a29f26db5c3a586fa57e7ea7]```<br><strong>@tim780: </strong>why does it say segment file already exists?<br><strong>@tim780: </strong>the cluster manager ui says status good or consuming<br><strong>@jackie.jxt: </strong>Hi @tim780, based on the exception, I think the time column is not configured correctly<br><strong>@jackie.jxt: </strong>Can you please share the table config and the schema?<br><strong>@tim780: </strong>```{
    "schemaName":"motion",
    "dimensionFieldSpecs":[
       {
          "name":"sensor_id",
          "dataType":"STRING"
       },
       {
          "name":"config_details",
          "dataType":"STRING"
       },
       {
          "name":"has_motion",
          "dataType":"BOOLEAN"
       }
    ],
    "metricFieldSpecs":[
    ],
    "dateTimeFieldSpecs":[
       {
          "name":"time_stamp",
          "dataType":"LONG",
          "format":"1:MILLISECONDS:EPOCH",
          "granularity":"1:MILLISECONDS"
       }
    ]
}```<br><strong>@tim780: </strong>```{
    "tableName":"motion",
    "tableType":"REALTIME",
    "segmentsConfig":{
       "timeColumnName":"timestampInEpoch",
       "timeType":"MILLISECONDS",
       "schemaName":"motion",
       "replicasPerPartition":"1"
    },
    "tenants":{

    },
    "tableIndexConfig":{
       "loadMode":"MMAP",
       "streamConfigs":{
          "streamType":"kafka",
          "stream.kafka.consumer.type":"lowlevel",
          "stream.kafka.topic.name":"ash-logger.etldb.derived.motion",
          "stream.kafka.decoder.class.name":"org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "stream.kafka.consumer.factory.class.name":"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.broker.list":"kafka-dev-1-kafka-bootstrap.kafka-dev-1:9092",
          "realtime.segment.flush.threshold.time":"3600000",
          "realtime.segment.flush.threshold.size":"50000",
          "stream.kafka.consumer.prop.auto.offset.reset":"smallest"
       }
    },
    "metadata":{
       "customConfigs":{

       }
    }
 }```<br><strong>@tim780: </strong>an example value for `time_stamp` is `1599000470000`<br><strong>@npawar: </strong>table config says
```"segmentsConfig":{
       "timeColumnName":"timestampInEpoch",```<br><strong>@tim780: </strong>ooh i think i screwed up in `timeColumnName` for config<br><strong>@tim780: </strong>i feel dumb<br><strong>@tim780: </strong>i just copy pasted this config from the example<br><strong>@tim780: </strong>sorry to bother; i will correct and retry and report back<br><strong>@npawar: </strong>no worries! lmk how it goes<br><strong>@tim780: </strong>quick question. i used `pinot-admin.sh` to create this table. how would i go about dropping it and recreating it with the proper config?<br><strong>@npawar: </strong>you can drop using swagger APIs. if you go to localhost:9000 you should see this option<br><strong>@npawar: </strong>assuming 9000 is your controller port<br><strong>@tim780: </strong>i will try; thank you<br><strong>@tim780: </strong>thank you @npawar and @jackie.jxt<br><strong>@tim780: </strong>it works now<br><strong>@tim780: </strong>i do see some `2020/09/01 23:09:14.532 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-(d1b54062_DEFAULT)] Event d1b54062_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.`<br><strong>@tim780: </strong>in the controller logs<br><strong>@tim780: </strong>but at least i see count(*) updating<br><strong>@jackie.jxt: </strong>@tim780 This warning is from Helix, which you can ignore<br><strong>@tim780: </strong><br><strong>@tim780: </strong>i don’t know what to make of the segments marked bad<br><strong>@pradeepgv42: </strong>Hi, fyi with the latest master, swagger endpoint seems to be broken. Not sure if anyone noticed<br><h3><u>#pinot-k8s-operator</u></h3><br><strong>@tim780: </strong>@tim780 has joined the channel<br>