Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/05 01:30:36 UTC

[GitHub] [incubator-pinot] dubin555 opened a new issue #6648: Support nested json schema in realtime ingest

dubin555 opened a new issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648


   Hi folks,
   Does streaming ingestion support a nested JSON schema, e.g.
   `{
     "a": {
       "b": "xxx"
     }
   }`
   Is there a way to declare this in the schema, something like: `"a.b" is a string`?
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] kishoreg commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-792509612


   Yes. You can create a column “name” and populate it with a transform function.
   
   See the JSON functions here:
   https://docs.pinot.apache.org/users/user-guide-query/supported-transformations
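
A minimal sketch of this suggestion, under assumptions that are not confirmed in this thread: the flattened column is called `name`, the nested source field is `props` (as in `{"props":{"name":"tom"}}`), and the transform is registered under `ingestionConfig.transformConfigs` using `jsonPathString`. The `extract_name` helper is hypothetical; it only emulates locally what such a transform would do to a single record and is not Pinot code.

```python
import json

# Assumed table-config fragment: derive column "name" from the nested "props" field.
# Field and column names are illustrative; check the transform docs for your Pinot version.
table_config_fragment = {
    "ingestionConfig": {
        "transformConfigs": [
            {
                "columnName": "name",
                "transformFunction": "jsonPathString(props, '$.name')",
            }
        ]
    }
}


def extract_name(record: dict) -> dict:
    """Hypothetical local emulation: copy props.name into a top-level 'name' field."""
    props = record.get("props") or {}
    return {"name": props.get("name")}


if __name__ == "__main__":
    raw = '{"props":{"name":"tom"}}'
    # Print the assumed config fragment as JSON, then the flattened record.
    print(json.dumps(table_config_fragment, indent=2))
    print(extract_name(json.loads(raw)))
```

With a config along these lines, the schema would declare `name` as an ordinary string column, so queries would not need to reference `props` at all.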
   
   
   
   




[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-791640826


   For the JSON field, you can configure it as a string field in the schema and enable the JSON index for it (see https://docs.pinot.apache.org/basics/indexing/json-index for more details).
   Note that the input data needs to be JSON strings. You may use the `jsonFormat()` function to convert an object to a JSON string if needed.
   
   For Kafka real-time ingestion, each Kafka partition is consumed by one thread (one consuming segment), and you can configure multiple replicas of the segment.
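
For the JSON-string approach above, a rough sketch of how the pieces might fit together. Everything named here is an assumption for illustration: the column name `props_json` is hypothetical, and the config keys (`transformConfigs`, `jsonIndexColumns`) should be checked against the linked docs for your Pinot version. The `to_json_string` helper is a local stand-in for what `jsonFormat()` is described as doing; it is not Pinot code.

```python
import json

# Assumed schema fragment: store the nested object as a plain string column.
schema_fragment = {
    "dimensionFieldSpecs": [
        {"name": "props_json", "dataType": "STRING"}
    ]
}

# Assumed table-config fragment: serialize "props" at ingestion time
# and enable the JSON index on the resulting column.
table_config_fragment = {
    "ingestionConfig": {
        "transformConfigs": [
            {"columnName": "props_json", "transformFunction": "jsonFormat(props)"}
        ]
    },
    "tableIndexConfig": {"jsonIndexColumns": ["props_json"]},
}


def to_json_string(value) -> str:
    """Local stand-in for jsonFormat(): turn an object into a compact JSON string."""
    return json.dumps(value, separators=(",", ":"))


if __name__ == "__main__":
    record = {"props": {"name": "tom"}}
    print(to_json_string(record["props"]))  # prints {"name":"tom"}
```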




[GitHub] [incubator-pinot] dubin555 commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
dubin555 commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-792486881


   > For the JSON field, you can configure it as a string field in the schema and enable the JSON index for it (see https://docs.pinot.apache.org/basics/indexing/json-index for more details).
   > Note that the input data needs to be JSON strings. You may use the `jsonFormat()` function to convert an object to a JSON string if needed.
   
   > For Kafka real-time ingestion, each Kafka partition is consumed by one thread (one consuming segment), and you can configure multiple replicas of the segment.
   
   Thanks for the reply! I checked the link you posted. If the upstream data is nested JSON, like this: `{"props":{"name":"tom"}}`, can the data be stored in a column such as 'name', WITHOUT the nested 'props'?
   
   I don't quite understand the Kafka part. Suppose the upstream Kafka topic has 10 partitions and I want to configure 2 consumer threads. Question 1: where do I configure this setting? Question 2: if lag builds up and I want to increase to 10 threads, how do I change the setting?




[GitHub] [incubator-pinot] dubin555 commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
dubin555 commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-791101882


   > Do you want the column name to be a.b and value xxx?
   
   Yes, sir.




[GitHub] [incubator-pinot] kishoreg commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-792509991


   We create one thread per partition automatically.




[GitHub] [incubator-pinot] dubin555 commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
dubin555 commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-791256094


   Also, does Kafka real-time ingestion have any resource controls, e.g. for the number of consumer threads?




[GitHub] [incubator-pinot] dubin555 closed issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
dubin555 closed issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648


   




[GitHub] [incubator-pinot] kishoreg commented on issue #6648: Support nested json schema in realtime ingest

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #6648:
URL: https://github.com/apache/incubator-pinot/issues/6648#issuecomment-791076731


   Do you want the column name to be a.b and value xxx?

