Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/10/01 05:52:31 UTC

[GitHub] [incubator-pinot] jackjlli commented on pull request #6070: Add Hadoop related dependencies in pinot-tool module

jackjlli commented on pull request #6070:
URL: https://github.com/apache/incubator-pinot/pull/6070#issuecomment-701903297


   Discussed with @mayankshriv. The issue before this PR is that the pinot-orc and pinot-parquet modules need Hadoop libraries, but the Hadoop dependencies are declared in `provided` scope, which means these Hadoop jars are not included in the classpath. Thus, we encounter the `NoClassDefFoundError` shown in the description of this PR above. PluginManager doesn't help here, because it requires the jars that contain the needed classes to be in the classpath in the first place.
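
One way to confirm whether the Hadoop classes are visible at runtime is a small probe like the sketch below. It is only an illustration: `ClasspathProbe` is a hypothetical helper, not part of Pinot, and `org.apache.hadoop.conf.Configuration` is picked as a representative Hadoop class rather than the exact class named in the PR's stack trace.

```java
public class ClasspathProbe {

    /**
     * Returns true if the named class can be loaded by this class's loader.
     * The class is loaded without being initialized, so no static
     * initializers run as a side effect of the check.
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class itself is missing.
            // LinkageError (includes NoClassDefFoundError): something the class depends on is missing or broken.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: a representative Hadoop class; pass another name as the first argument to probe it instead.
        String probe = args.length > 0 ? args[0] : "org.apache.hadoop.conf.Configuration";
        System.out.println(probe + " on classpath: " + isOnClasspath(probe));
    }
}
```

Running it with the same classpath as the failing Pinot command should report whether that class is loadable there.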
   
   There are two ways to solve this issue:
   1. Configure the pom so that these Hadoop jars are included in the classpath, as this PR does. An alternative within this approach is to set the scope of the Hadoop dependencies to `compile` in the pom files of pinot-orc and pinot-parquet.
   2. Have users manually add those Hadoop jars to the classpath when running Pinot commands.
   
   The 1st approach seems better. With the 2nd one, users have to add these jars to the classpath one by one, and it may take them several attempts to track down every missing jar. Plus, we would have to tell users to do that.
   
   As for the 1st approach, we can revert the change in this PR and move the Hadoop dependencies down into the pinot-orc and pinot-parquet modules, so that if we don't want to support some formats in the future, the Hadoop dependencies can be removed altogether. I've tested this change in the HadoopSegmentCreationJob and it works well. Here is the new PR: https://github.com/apache/incubator-pinot/pull/6088.

