Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2022/02/25 02:00:24 UTC

Apache Pinot Daily Email Digest (2022-02-24)

### _#general_

  
 **@h20210119:** @h20210119 has joined the channel  
 **@alex:** @alex has joined the channel  
 **@ashish.athresh:** @ashish.athresh has joined the channel  
 **@abhishek.tanwade:** @abhishek.tanwade has joined the channel  
 **@sunhee.bigdata:** @sunhee.bigdata has joined the channel  

###  _#random_

  
 **@h20210119:** @h20210119 has joined the channel  
 **@alex:** @alex has joined the channel  
 **@ashish.athresh:** @ashish.athresh has joined the channel  
 **@abhishek.tanwade:** @abhishek.tanwade has joined the channel  
 **@sunhee.bigdata:** @sunhee.bigdata has joined the channel  

###  _#troubleshooting_

  
 **@h20210119:** @h20210119 has joined the channel  
 **@alex:** @alex has joined the channel  
 **@vibhor.jaiswal:** Hi all, we have been trying to integrate Pinot with Kafka topics secured with SASL_PLAINTEXT, and we keep getting the exceptions below. To double-check, I created a Java client with pretty much the same settings, and it connects and consumes messages, but Pinot is not able to consume them. Can someone suggest what's wrong here?  
`2022/02/23 16:50:56.586 ERROR [PinotTableIdealStateBuilder] [grizzly-http-server-0] Could not get PartitionGroupMetadata for topic: gsp.dataacquisition.risk.public.v2.<Redacted> of table: <Redacted>_REALTIME`  
`org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata`  
`2022/02/23 16:50:56.591 ERROR [PinotTableRestletResource] [grizzly-http-server-0] org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata`  
`java.lang.RuntimeException: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata`  
`at org.apache.pinot.controller.helix.core.PinotTableIdealStateBuilder.getPartitionGroupMetadataList(PinotTableIdealStateBuilder.java:172) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-428e7d75f91b9d4b4a2288f131d02d643bb2df5d]`  
`at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.getNewPartitionGroupMetadataList(PinotLLCRealtimeSegmentManager.java:764)`  
Below is the table config for reference:
```
{
  "tableName": "<Redacted>",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "<Redacted>",
    "timeColumnName": "PublishDateTimeUTC",
    "allowNullTimeValue": false,
    "replication": "1",
    "replicasPerPartition": "2",
    "completionConfig": {
      "completionMode": "DOWNLOAD"
    }
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant",
    "tagOverrideConfig": {}
  },
  "tableIndexConfig": {
    "invertedIndexColumns": [],
    "noDictionaryColumns": ["some columns "],
    "rangeIndexColumns": [],
    "rangeIndexVersion": 1,
    "autoGeneratedInvertedIndex": false,
    "createInvertedIndexDuringSegmentGeneration": false,
    "sortedColumn": [],
    "bloomFilterColumns": [],
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "gsp.dataacquisition.risk.public.v2.<Redacted>",
      "stream.kafka.broker.list": "comma separated list of servers",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.consumer.prop.auto.offset.reset": "largest",
      "stream.kafka.schema.registry.url": ,
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.sasl.mechanism": "SCRAM-SHA-256",
      "stream.kafka.security.protocol": "SASL_PLAINTEXT",
      "stream.kafka.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"some user\" password=\"somepwd\"",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.size": "0",
      "realtime.segment.flush.threshold.time": "24h",
      "realtime.segment.flush.autotune.initialRows": "3000000",
      "realtime.segment.flush.threshold.segment.size": "500M"
    },
    "onHeapDictionaryColumns": [],
    "varLengthDictionaryColumns": [],
    "enableDefaultStarTree": false,
    "enableDynamicStarTreeCreation": false,
    "aggregateMetrics": false,
    "nullHandlingEnabled": false
  },
  "metadata": {},
  "quota": {},
  "routing": {
    "instanceSelectorType": "strictReplicaGroup"
  },
  "query": {},
  "ingestionConfig": {},
  "isDimTable": false,
  "upsertConfig": {
    "mode": "FULL",
    "comparisonColumn": "PublishDateTimeUTC"
  },
  "primaryKeyColumns": ["BusinessDate", "UID", "UIDType", "LegId"]
}
```
**@mayanks:** I am guessing it is unable to connect to Kafka cc: @slack1
@npawar  
**@vibhor.jaiswal:** @mayanks @slack1 @npawar This issue was basically about the SASL integration. We fixed it by removing the `stream.kafka` prefix from `stream.kafka.sasl.mechanism`, `stream.kafka.security.protocol` and `stream.kafka.sasl.jaas.config`. Please feel free to publish this in the documentation, because we could not find documentation anywhere on how to integrate a secured (SASL/SSL) Kafka with Pinot. It will be a great value add for end users.  
**@vibhor.jaiswal:** Another thing we did here was add a semicolon at the end of the jaas.config value, like this: `"sasl.jaas.config":"org.apache.kafka.common.security.scram.ScramLoginModule required username=\"some user\" password=\"somepwd\";"`  
**@mayanks:** Could you paste a sample config here, @vibhor.jaiswal? @mark.needham we can then put it in our docs.  
**@slack1:** Thank you, @vibhor.jaiswal. We’ll add this to our docs  
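For reference, the fix described in this thread can be sketched as the `streamConfigs` fragment below. It is pieced together only from @vibhor.jaiswal's two messages (the three security keys lose their `stream.kafka.` prefix, and the JAAS value gains a trailing semicolon); it is not a config that was posted or verified in the channel. The topic and broker values are placeholders, and the remaining keys are copied from the original table config.

```json
{
  "streamConfigs": {
    "streamType": "kafka",
    "stream.kafka.topic.name": "<topic name>",
    "stream.kafka.broker.list": "<comma-separated list of brokers>",
    "stream.kafka.consumer.type": "lowlevel",
    "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
    "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
    "security.protocol": "SASL_PLAINTEXT",
    "sasl.mechanism": "SCRAM-SHA-256",
    "sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"some user\" password=\"somepwd\";"
  }
}
```

The unprefixed names match the standard Kafka client property names, which is presumably why the standalone Java client worked with the same settings.  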
 **@ashish.athresh:** @ashish.athresh has joined the channel  
 **@abhishek.tanwade:** @abhishek.tanwade has joined the channel  
 **@elon.azoulay:** qq, do you recommend setting `controller.enable.batch.message.mode` to true? I see a GitHub PR from Pinot 0.2.0 that switched it to false by default due to high controller GC. Do you think it's safe to enable now? Pinot has evolved a lot since then. :slightly_smiling_face:  
**@elon.azoulay:** context: we have a lot of tables with tiny segments, and we
are going through and fixing them, but in the meantime we notice a huge amount
of zk messages, mostly from broker resource and the tables w tiny segments.  
**@elon.azoulay:** If anything we can try and let you know how it goes, unless
you recommend not to even try it.  
**@elon.azoulay:** Sorry to bug you guys: @mayanks @jackie.jxt @xiangfu0 do
you think it's safe to enable batch mode on the controllers?  
**@elon.azoulay:** Whenever you have time lmk  
**@jackie.jxt:** This flag is actually passed to Helix, and I don't think we have upgraded Helix since `0.2.0`. I'm not sure if the issue described in the PR applies to you, so I would suggest enabling it in a staging cluster and monitoring how it works.  
**@elon.azoulay:** nice, thanks! Will let you know how it goes.  
**@elon.azoulay:** Hopefully the findings will be helpful  
 **@sunhee.bigdata:** @sunhee.bigdata has joined the channel  
 **@sunhee.bigdata:** Hi, everyone :slightly_smiling_face: I am trying a batch ingestion job. There are 5 server instances in our Pinot cluster (3 are running, 2 are down). When segments are assigned, some of them are assigned to the down server instances. When all segments are assigned to down server instances, I can't use the table. Is this normal, or is there a solution? Thank you :slightly_smiling_face:  
**@mayanks:** What is your replication? If replication = 1, then server down =
data unavailability.  
**@mayanks:** Do you know why the servers are down?  