Posted to issues@hive.apache.org by "David Mollitor (Jira)" <ji...@apache.org> on 2021/01/16 22:30:00 UTC

[jira] [Commented] (HIVE-24348) Beeline: Isolating dependencies and execution with java

    [ https://issues.apache.org/jira/browse/HIVE-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266677#comment-17266677 ] 

David Mollitor commented on HIVE-24348:
---------------------------------------

One thing you will have to consider with this is the {{dfs}} command that is built into beeline.


https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineHiveCommands

If this feature is going to stay, then beeline will need to be able to include the Hadoop JARs.  Beeline could declare the Hadoop libraries as an explicit runtime dependency instead of dynamically loading them from {{HADOOP_HOME}}, which would make it more standalone.  Maybe that is a good improvement, but it would add a LOT of transitive dependencies to beeline in order to support HDFS, Azure, S3, etc.

I always lean in favor of KISS.  My gut says that the {{dfs}} command could be removed, making beeline just a simple JDBC tool, but I'm not sure.

> Beeline: Isolating dependencies and execution with java
> -------------------------------------------------------
>
>                 Key: HIVE-24348
>                 URL: https://issues.apache.org/jira/browse/HIVE-24348
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 3.1.0
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, beeline code, binaries and executables are somewhat tightly coupled with the hive product. Executing beeline from a node with just a JRE installed and some jars on the classpath is impossible.
> * The beeline.sh/hive scripts rely on HADOOP_HOME being set and are designed to use the "hadoop" executable to run beeline.
> * Ideally, just the hive-beeline.jar and hive-jdbc-standalone jars should be enough, but sadly they aren't. The latter jar adds more problems than it solves: because all the class files are shaded, some dependencies cannot be resolved.
> * Beeline has many other dependencies like hive-exec, hive-common, hadoop-common, supercsv, jline, commons-cli, commons-io, commons-logging etc. While it may not be possible to eliminate some of these, we should at least have a self-contained jar that includes all of them to make it work.
> * The underlying script used to run beeline should use JAVA as an alternate means of execution if HADOOP_HOME is not set.
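
The fallback described in the last bullet of the issue could be sketched roughly as follows. This is only an illustration of the decision logic, not the actual beeline.sh: the helper name build_beeline_command, the jar file names and the JDBC URL are assumptions made up for the example. The main class org.apache.hive.beeline.BeeLine and the "hadoop jar" / "java -cp" invocation forms are the only parts taken from the real tools.

```python
import os
from typing import List, Mapping, Sequence

# Beeline's real main class; everything else in this sketch is illustrative.
BEELINE_MAIN_CLASS = "org.apache.hive.beeline.BeeLine"


def build_beeline_command(
    env: Mapping[str, str],
    beeline_jar: str,
    extra_jars: Sequence[str],
    args: Sequence[str],
) -> List[str]:
    """Return the argv used to launch beeline (hypothetical helper)."""
    hadoop_home = env.get("HADOOP_HOME")
    if hadoop_home:
        # HADOOP_HOME is set: delegate to the "hadoop" executable, which
        # supplies the Hadoop classpath itself.
        hadoop = os.path.join(hadoop_home, "bin", "hadoop")
        return [hadoop, "jar", beeline_jar, BEELINE_MAIN_CLASS, *args]
    # HADOOP_HOME is not set: fall back to plain java with an explicit
    # classpath, so every needed jar must be listed by the caller.
    classpath = os.pathsep.join([beeline_jar, *extra_jars])
    return ["java", "-cp", classpath, BEELINE_MAIN_CLASS, *args]


if __name__ == "__main__":
    # Placeholder jar names and URL, for illustration only.
    print(build_beeline_command(
        os.environ,
        "hive-beeline.jar",
        ["hive-jdbc-standalone.jar"],
        ["-u", "jdbc:hive2://localhost:10000"],
    ))
```

In the java branch nothing provides the Hadoop classes, so a command such as {{dfs}} would presumably only work if the Hadoop jars were added to extra_jars, which is where the transitive-dependency concern from the comment comes in.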



--
This message was sent by Atlassian Jira
(v8.3.4#803005)