Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/01/12 14:41:52 UTC

[GitHub] [incubator-pinot] alexandervivas opened a new issue #6433: Consumption from external kafka instances in third party services

alexandervivas opened a new issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433


   Hello guys.
   
   In our company we already have Kafka as a service hosted by a third-party provider, so our Pinot cluster can only connect under the following constraints:
   
   - We don't have access to the third-party Kafka's ZooKeeper address.
   - Our connection can only be set up using SSL certificates and basic auth.
   
   At first we had some trouble getting Pinot to connect to our Kafka topics, as we already described in #6123, but we worked around that by implementing something on our side.
   
   What we need some assistance with now is that when we create a realtime table consuming the stream from one of those Kafka topics, Pinot creates a Kafka consumer group with an empty name for that consumption. We've been trying lots of different configurations in the streamConfigs section of our table index config, but none of them seem to work.
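   
   For example, we've been trying variations along these lines (the `stream.kafka.consumer.prop.` prefix is the one we already use for `auto.offset.reset` in our stream configs; which of these keys, if any, Pinot actually forwards to the underlying Kafka consumer is exactly what we're unsure about):
   
   ```
   "stream.kafka.hlc.group.id": "dtp-mls-pinot-tables-$NAMESPACE",
   "stream.kafka.consumer.prop.group.id": "dtp-mls-pinot-tables-$NAMESPACE"
   ```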
   
   Could you please give us some more insight into how to properly pass this configuration to Pinot's Kafka consumer?
   
   Thanks in advance.
   



[GitHub] [incubator-pinot] alexandervivas edited a comment on issue #6433: Consumption from external kafka instances in third party services

alexandervivas edited a comment on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-809409668


   > I have this issue too.
   > Our kafka platform must be accessed with an assigned group.id.
   
   @kyostyle1 Did you find a solution? I ended up coding something in a fork.





[GitHub] [incubator-pinot] npawar commented on issue #6433: Consumption from external kafka instances in third party services

npawar commented on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-759077222


   As Xiang mentioned, you don't need to worry about group id for LLC. Can you share your Pinot table config? Are you able to query the table and get results?



[GitHub] [incubator-pinot] fx19880617 commented on issue #6433: Consumption from external kafka instances in third party services

fx19880617 commented on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-759067772


   @npawar 
   Are you using the Kafka high-level consumer (HLC) or the simple consumer (LLC)? If you're using LLC, then the group id is not really used in our consumption path.
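   
   For illustration (purely a sketch, not a full or verified config), an LLC streamConfigs only needs entries along these lines, with no group id at all; the factory and decoder class names below are the kafka20 plugin ones, and the topic/broker values are placeholders:
   
   ```
   "streamConfigs": {
     "streamType": "kafka",
     "stream.kafka.consumer.type": "simple",
     "stream.kafka.topic.name": "<your-topic>",
     "stream.kafka.broker.list": "<broker-host:port>",
     "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
     "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder"
   }
   ```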



[GitHub] [incubator-pinot] kyostyle1 commented on issue #6433: Consumption from external kafka instances in third party services

kyostyle1 commented on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-781237508


   I have this issue too.
   Our Kafka platform must be accessed with an assigned group.id.
   



[GitHub] [incubator-pinot] kyostyle1 edited a comment on issue #6433: Consumption from external kafka instances in third party services

kyostyle1 edited a comment on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-812479016


   > > I have this issue too.
   > > Our kafka platform must be accessed with an assigned group.id.
   > 
   > @kyostyle1 Did you find a solution? I ended up coding something in a fork.
   
   @alexandervivas Could you share your fork? I have a problem where the brokers cannot query the realtime servers after adding a group.id in the consumer path.






[GitHub] [incubator-pinot] alexandervivas edited a comment on issue #6433: Consumption from external kafka instances in third party services

alexandervivas edited a comment on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-759484544


   Hey @fx19880617 and @npawar, thanks for commenting.
   
   When we started testing Pinot for our analytics modules, the Pinot docs didn't describe any way for us to configure LLC, since that required the Kafka cluster's ZooKeeper address, which we don't have access to on the third-party platform, so we decided to go with the consumer type that could read the entire stream.
   
   So far, our table config has looked like this:
   
   ```
   {
     "tableName":"dpt_video_event_captured_v2",
     "tableType":"REALTIME",
     "segmentsConfig":{
       "timeColumnName":"EVENT_TIME",
       "timeType":"SECONDS",
       "retentionTimeUnit":"DAYS",
       "retentionTimeValue":"3650",
       "segmentAssignmentStrategy":"BalanceNumSegmentAssignmentStrategy",
       "schemaName":"dpt_video_event_captured_v2",
       "replication":"1",
       "replicasPerPartition":"1"
     },
     "tenants":{
   
     },
     "tableIndexConfig":{
       "loadMode":"MMAP",
       "streamConfigs":{
         "streamType":"kafka",
         "stream.kafka.consumer.type":"simple",
         "stream.kafka.topic.name":"dpt_video_event-captured_v2",
         "stream.kafka.decoder.class.name":"org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
         "stream.kafka.consumer.factory.class.name":"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
         "stream.kafka.broker.list":"$KAFKA_BROKER_URL",
         "stream.kafka.hlc.bootstrap.server":"$KAFKA_BROKER_URL",
         "schema.registry.url":"$SCHEMA_REGISTRY_URL",
         "security.protocol":"SSL",
         "ssl.truststore.location":"/opt/pinot/deployment/kafka/truststore.jks",
         "ssl.truststore.password":"$KAFKA_BROKER_PASSWORD",
         "ssl.keystore.location":"/opt/pinot/deployment/kafka/keystore.p12",
         "ssl.keystore.password":"$KAFKA_BROKER_PASSWORD",
         "ssl.keystore.type":"PKCS12",
         "ssl.key.password":"$KAFKA_BROKER_PASSWORD",
         "stream.kafka.decoder.prop.schema.registry.rest.url":"$SCHEMA_REGISTRY_URL",
         "stream.kafka.decoder.prop.schema.registry.ssl.truststore.location":"/opt/pinot/deployment/sr/truststore.jks",
         "stream.kafka.decoder.prop.schema.registry.ssl.truststore.password":"$SCHEMA_REGISTRY_PASSWORD",
         "stream.kafka.decoder.prop.schema.registry.ssl.truststore.type":"JKS",
         "stream.kafka.decoder.prop.schema.registry.ssl.keystore.location":"/opt/pinot/deployment/sr/keystore.p12",
         "stream.kafka.decoder.prop.schema.registry.ssl.keystore.password":"$SCHEMA_REGISTRY_PASSWORD",
         "stream.kafka.decoder.prop.schema.registry.ssl.keystore.type":"PKCS12",
         "stream.kafka.decoder.prop.schema.registry.ssl.key.password":"$SCHEMA_REGISTRY_PASSWORD",
         "stream.kafka.decoder.prop.schema.registry.ssl.protocol":"SSL",
         "stream.kafka.decoder.prop.schema.registry.basic.auth.credentials.source":"USER_INFO",
         "stream.kafka.decoder.prop.schema.registry.basic.auth.user.info":"$SCHEMA_REGISTRY_USERNAME:$SCHEMA_REGISTRY_PASSWORD",
         "realtime.segment.flush.threshold.time":"30s",
         "realtime.segment.flush.threshold.size":"0",
         "realtime.segment.flush.desired.size":"500M",
         "stream.kafka.consumer.prop.auto.offset.reset":"smallest",
         "stream.kafka.hlc.group.id":"dtp-mls-pinot-tables-$NAMESPACE"
       },
       "invertedIndexColumns":[ "EXTRAPARAM15" ]
     },
     "metadata":{
       "customConfigs":{
   
       }
     }
   }
   ```
   
   We know very little about how to configure this properly, and we're trying to tune it using your auto-tune tool, but we don't actually know for sure whether all of those config params are necessary (besides the ones used to connect to Kafka via basic auth and SSL certificates).
   
   Now I see the docs have changed and it is possible to configure the table using LLC. But if the group id is not needed, or not used in the consumption path, how do we properly identify Pinot's consumers when we look at the third-party platform?
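   
   For instance, we wondered whether something like the following would let us tag Pinot's consumers with a recognizable name on the platform side. The `stream.kafka.consumer.prop.` prefix is the one we already use for `auto.offset.reset` above, but we don't know whether `client.id` or `group.id` passed this way actually reach the Kafka consumer:
   
   ```
   "stream.kafka.consumer.prop.client.id": "dtp-mls-pinot-$NAMESPACE",
   "stream.kafka.consumer.prop.group.id": "dtp-mls-pinot-tables-$NAMESPACE"
   ```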
   
   And if for some reason we want to experiment with that topic for another kind of analytics, how do we make Pinot consume it without getting in the way of the first table?
   
   @npawar, with this configuration we are getting data from Kafka and are able to query it, but some of the records don't seem to be getting ingested into Pinot because of some NPEs, which we are trying to debug right now.
   
   Any suggestions are greatly appreciated.




[GitHub] [incubator-pinot] alexandervivas closed issue #6433: Consumption from external kafka instances in third party services

alexandervivas closed issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433


   



[GitHub] [incubator-pinot] alexandervivas commented on issue #6433: Consumption from external kafka instances in third party services

alexandervivas commented on issue #6433:
URL: https://github.com/apache/incubator-pinot/issues/6433#issuecomment-812481456


   @kyostyle1, sorry for the confusion: what I actually coded was support for Pinot to connect to Kafka clusters hosted on other platforms through SSL certificates and basic auth.
   
   I'm afraid the group id itself is a lost cause: because Pinot manually selects the offsets it consumes from (mostly for data recovery purposes and for consistency of the segments), it doesn't really need, and doesn't use, a consumer group id.
   
   That shouldn't be a problem, though. Even if, for example, you want to create a replica of your Pinot cluster elsewhere, Pinot keeps track of the offsets itself, so having an empty group id for those consumers won't hurt. But that is just my guess; maybe @npawar or @fx19880617 can give more insight on this, just to make sure what I said is correct, and then we can close this issue?

