Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/10/30 05:49:11 UTC

[GitHub] [incubator-pinot] snleee edited a comment on issue #4737: Adding Druid Segment RecordReader

snleee edited a comment on issue #4737: Adding Druid Segment RecordReader
URL: https://github.com/apache/incubator-pinot/pull/4737#issuecomment-547748095
 
 
   > We have too many modules at the top level. Let's move the package to a sub-folder (pinot-contrib) and rename the module to druid-2-pinot-migration or something like that.
   > 
   > How can a user use this tool?
   > 
   > Will it be something like, Druid2Pinot <druid_cluster_endpoint_config> <pinot_cluster_endpoint_config>?
   > 
   > At a high level, what are the various steps involved in this? A simple issue with some description will help in understanding the goal and reviewing this PR.
   @kishoreg 
   
   We will address the package location before we check in this PR. We were also debating top level vs. `pinot-contrib`.
   
   What's your opinion on adding a Druid dependency to `pinot-tools` or `pinot-hadoop`? If we want to use the existing segment generation tools (e.g. `pinot-admin`'s `SegmentGenerationCommand` or the `PinotSegmentGeneration` Hadoop job), we need to pull in the Druid dependency (`pinot-tools` and `pinot-hadoop` will pull in `pinot-druid-migration`, which pulls in Druid).
   
   Another approach is to provide this tool as completely separate code outside of the Pinot project. In that case, we can add a standalone class with its own main function for the command-line tool and provide a standalone Hadoop job. All code will be under `pinot-contrib/pinot-druid-migration`, and other parts of the Pinot code won't import this package.
   
   For the user interface, our current approach is to reuse the existing segment generation tools (command line, Hadoop job) and add `DRUID_SEGMENT` as an extra supported input data format.
   
   We can add more tools around the segment converter to automate the end-to-end migration process.
