Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/08/17 20:42:27 UTC

[GitHub] [incubator-pinot] lgo opened a new issue #5883: Ingestion spec templating is rather undocumented

lgo opened a new issue #5883:
URL: https://github.com/apache/incubator-pinot/issues/5883


   While trying to set up ingestion, I ran across the `-propertyFile` and `-values` arguments in `LaunchDataIngestionJobCommand`. I didn't see anything in the docs about templating the job spec, and only later found one reference describing it in https://github.com/pinot-contrib/pinot-docs/blob/eb9a8a07687bfe78b022ba0825123fd43e316795/operators/cli.md.
   
   It would be helpful to document this, including answers to questions such as:
   * What the format of the `propertyFile` is.
   * What the format for templated variables in the job spec file should be.
   
   My particular use case, where this would be great, is a setup where the ingestion job (via Spark or Hadoop) is distributed only as a single JAR, and hooking up external file dependencies is a pain. For this, I'd ideally like to (1) bundle a basic configuration file as a resource in the JAR (TBD on that) or as a separate distribution, and (2) provide any overrides at run time via parameters (e.g., by a scheduler application).
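
To make the use case concrete, here is a minimal Python sketch of the "bundled defaults plus run-time overrides" flow. It is not Pinot's implementation. The `key=value` property format, the `${name}` placeholder syntax, and the helper names (`parse_properties`, `parse_overrides`, `render_spec`) are all assumptions for illustration, since the real formats are exactly what this issue asks to have documented.

```python
from string import Template


def parse_properties(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments.

    Assumed format for illustration only; not taken from the Pinot docs.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def parse_overrides(pairs):
    """Turn ['k=v', ...] (e.g. passed in by a scheduler) into a dict."""
    return dict(pair.split("=", 1) for pair in pairs)


def render_spec(template_text, base_props, overrides):
    """Fill ${name} placeholders; run-time overrides win over bundled defaults.

    Template.substitute raises KeyError if a placeholder has no value,
    which surfaces a missing parameter before the job starts.
    """
    values = {**base_props, **overrides}
    return Template(template_text).substitute(values)


# Hypothetical defaults that would ship inside the JAR as a resource.
BASE_PROPERTIES = """
# defaults bundled with the job
year=2020
month=08
"""

# Hypothetical one-line fragment of a templated job spec.
SPEC_TEMPLATE = "inputDirURI: 'rawdata/${year}/${month}/${day}'"

rendered = render_spec(
    SPEC_TEMPLATE,
    parse_properties(BASE_PROPERTIES),
    parse_overrides(["month=09", "day=17"]),
)
print(rendered)  # inputDirURI: 'rawdata/2020/09/17'
```

Merging the two sources into one dictionary before substitution keeps the precedence rule (overrides beat defaults) in a single place.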


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] kishoreg commented on issue #5883: Ingestion spec templating is rather undocumented

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #5883:
URL: https://github.com/apache/incubator-pinot/issues/5883#issuecomment-675120018


   Single JAR vs. multiple external JARs is something that keeps coming up once in a while. The biggest challenge is dealing with the various versions of execution frameworks (Spark, Hadoop), file systems (HDFS, S3, GCS, ADLS), and data formats (JSON, CSV, Parquet, Avro, Protobuf), etc.
   
   Creating one JAR for each combination is not scalable. That's the reason we added the concept of plugins and allowed users to include only the plugins needed.
   
   One other alternative is to provide a standalone tool that takes plugin names as input and creates an uber JAR. WDYT?
   
   
   




[GitHub] [incubator-pinot] kishoreg edited a comment on issue #5883: Ingestion spec templating is rather undocumented

Posted by GitBox <gi...@apache.org>.
kishoreg edited a comment on issue #5883:
URL: https://github.com/apache/incubator-pinot/issues/5883#issuecomment-675120018


   Single JAR vs. multiple external JARs is something that keeps coming up once in a while. The biggest challenge is dealing with the various versions of execution frameworks (Spark, Hadoop), file systems (HDFS, S3, GCS, ADLS), and data formats (JSON, CSV, Parquet, Avro, Protobuf), etc.
   
   Creating one JAR for each combination is not scalable. That's the reason we added the concept of plugins and allowed users to include only the plugins needed.
   
   One other alternative is to provide a standalone tool that takes plugin names as input and creates an uber JAR. WDYT?
   
   
   

