Posted to user@hive.apache.org by Vijay <te...@gmail.com> on 2009/09/15 06:56:35 UTC

Design help with custom front-end for hive

Hi,

I tried to begin a discussion along these lines earlier, but it went off on
a tangent. My plan is to build a custom front-end for Hive. The Hive CLI and
HWI are great for the needs of developers and such, but the idea here is to
make Hive a lot more accessible to non-technical and non-SQL users (mostly
business and marketing types). The front-end will have some standard
features: you can "build and save" queries for reuse, run queries and
retrieve results in CSV/XLS format, etc. Ideally the front-end also lets
multiple users submit and run jobs simultaneously. I see two approaches to
this problem.

1. Leverage the JDBC driver with off-the-shelf or open source report design
software that has most of the above features already built in. There seem to
be known issues integrating Hadoop + Hive + the JDBC driver with some of the
existing tools, I believe because Hive's JDBC implementation is minimal, but
hopefully those issues could be solved. Not sure if this should be a major
concern.

2. Just build a custom UI from scratch. I guess I could start with HWI and
customize it. Some of the features above probably make sense to fold into
HWI (maybe), but some of them would be very specific to our internal systems
and applications. Maybe it'd be possible to somehow achieve both.

Any thoughts or suggestions are greatly appreciated.

Thanks,
Vijay

Re: Design help with custom front-end for hive

Posted by Vijay <te...@gmail.com>.
I appreciate your interest in this topic, and I'm sure some or all of the
changes above can make HWI more accessible. In general, though, after
spending more time thinking and playing around, I'm convinced the right
thing to do is to firm up the JDBC driver and use that as the primary
mechanism of integration with Hive.

In fact, there is another thread along these lines. There are so many
wonderful report design/management tools out there that there is not much
point in reinventing the wheel here, right? The primary integration strategy
with all of these tools is via JDBC (or ODBC). Hive shouldn't be any
different. But right now the Hive JDBC driver doesn't seem to play nicely out
of the box with many tools (admittedly from my limited testing). On Windows
things are slightly worse. So I think the first priority along this path
should be firming up the JDBC driver by making it as thin and full-featured
as reasonable. It seems so to me anyway :)
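
To make the JDBC path concrete, here is a minimal sketch of what a reporting
tool does under the hood: load the driver, open a connection, run a query,
and read column names plus rows. The driver class name and the jdbc:hive://
URL form are assumptions based on the Thrift-based Hive server of this
period (later releases use a different driver class and a jdbc:hive2://
prefix), and the class and method names are made up for illustration. A
minimal driver may not support every call used here, for example
getMetaData(), which is the kind of gap that trips up generic tools.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class HiveJdbcSketch {
    // Assumption: driver class shipped with the Hive JDBC jar of this era.
    static final String DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";

    /** Builds a URL of the assumed form jdbc:hive://host:port/database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive://" + host + ":" + port + "/" + database;
    }

    /** Runs one query and returns the header row followed by all data rows. */
    static List<List<String>> runQuery(String url, String sql)
            throws SQLException, ClassNotFoundException {
        Class.forName(DRIVER); // throws if the Hive JDBC jar is not on the classpath
        List<List<String>> rows = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            // Generic tools rely on metadata; a minimal driver may reject this call.
            ResultSetMetaData meta = rs.getMetaData();
            int cols = meta.getColumnCount();
            List<String> header = new ArrayList<>();
            for (int i = 1; i <= cols; i++) {
                header.add(meta.getColumnName(i));
            }
            rows.add(header);
            while (rs.next()) {
                List<String> row = new ArrayList<>();
                for (int i = 1; i <= cols; i++) {
                    row.add(rs.getString(i)); // may be null for NULL values
                }
                rows.add(row);
            }
        }
        return rows;
    }
}
```

Calling runQuery(jdbcUrl("localhost", 10000, "default"), "SELECT ...")
against a running Hive server would return the rows; the host, port, and
database name here are placeholders.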

This may be something I'll start working on in my copious amounts of free
time :)

Thoughts/pointers/problems?

Thanks for listening!
Vijay

On Sep 15, 2009 1:07 PM, "Edward Capriolo" <ed...@gmail.com> wrote:

On Tue, Sep 15, 2009 at 3:12 PM, Abhijit Pol <ap...@rocketfuelinc.com> wrote:
> +1 > > I should be ab...
Please comment on https://issues.apache.org/jira/browse/HIVE-833
Thank you
Edward

Re: Design help with custom front-end for hive

Posted by Edward Capriolo <ed...@gmail.com>.
Please comment on https://issues.apache.org/jira/browse/HIVE-833
Thank you
Edward

Re: Design help with custom front-end for hive

Posted by Abhijit Pol <ap...@rocketfuelinc.com>.
+1

I should be able to give feedback and discuss recent changes I made to HWI
to make it even easier for non-tech customers in the company.
Couldn't agree more that making it easy to query and look at the results
makes Hive adoption very easy. People are already eager to use it; we just
have to make sure not to frustrate them :-)
Looking forward to the Jira on this...

Re: Design help with custom front-end for hive

Posted by Edward Capriolo <ed...@gmail.com>.
Hey Vijay,

As you have pointed out the generic tools are generic :)

There are a few things I am looking to do with HWI in the next few weeks:

Infrastructure
1) Clean up the build file, add an Ivy file, add an Eclipse launch target.

Web Interface
2) Replace all JSP with Wicket classes. I have not yet opened a Jira
for this but I have mentioned it a few times on the list. Wicket is
really nice. It is going to move all the code from JSP to Java
classes. It has unit testing. It also has built-in Ajax capabilities.
Most of the features that are in the web interface right now might be
turned into one or two sexy Ajax pages. I am looking to do some things
like have the new session bucket be an Ajax-updated table for
"streaming results".

As for the features you mentioned:

1) standard features where you can "build and save" queries for reuse

HWI could simply persist saved queries to a directory on the web
server (bean persistence) or to a third party database, or HDFS.
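
As a rough illustration of the "directory on the web server" option, the
sketch below stores each saved query as a plain text file named after the
query. This swaps in plain files for the bean persistence mentioned above,
purely to keep the example short; the class name, the .hql extension, and
the naming rule are all assumptions, not anything HWI does today.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

final class SavedQueryStore {
    private static final String EXT = ".hql"; // hypothetical file extension
    private final Path dir;

    SavedQueryStore(Path dir) throws IOException {
        // Create the directory (and parents) if it does not exist yet.
        this.dir = Files.createDirectories(dir);
    }

    private Path fileFor(String name) {
        // Restrict names so a user cannot escape the directory with "../".
        if (!name.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("invalid query name: " + name);
        }
        return dir.resolve(name + EXT);
    }

    /** Saves (or overwrites) a query under the given name. */
    void save(String name, String query) throws IOException {
        Files.write(fileFor(name), query.getBytes(StandardCharsets.UTF_8));
    }

    /** Loads a previously saved query. */
    String load(String name) throws IOException {
        return new String(Files.readAllBytes(fileFor(name)), StandardCharsets.UTF_8);
    }

    /** Lists the names of all saved queries in sorted order. */
    List<String> list() throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.map(p -> p.getFileName().toString())
                 .filter(f -> f.endsWith(EXT))
                 .forEach(f -> names.add(f.substring(0, f.length() - EXT.length())));
        }
        Collections.sort(names);
        return names;
    }
}
```

A third party database or HDFS could sit behind the same three methods, which
is one way to keep the storage choice pluggable.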

2) run queries and retrieve results in CSV/XLS format

We can handle this, given that the web server has access to the
metastore and table schema information. We could read blocks from HDFS
and format them appropriately, or run a SELECT * query and stream the
results in the requested format.
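
The formatting half of this is small. Below is a sketch of CSV output with
the usual quoting rules (fields containing commas, quotes, or line breaks
are wrapped in double quotes, and embedded quotes are doubled). The class
and method names are hypothetical, and the rows are plain string lists so
the example does not depend on how the data is actually fetched.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

final class CsvRows {
    /** Quotes a single field if needed; NULL values become empty fields. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Writes one row; can be called per row to stream large results. */
    static void writeRow(Writer out, List<String> fields) throws IOException {
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                out.write(',');
            }
            out.write(escape(fields.get(i)));
        }
        out.write("\r\n");
    }

    /** Convenience wrapper for small results: header plus rows as one string. */
    static String toCsv(List<String> header, Iterable<List<String>> rows) {
        StringWriter out = new StringWriter();
        try {
            writeRow(out, header);
            for (List<String> row : rows) {
                writeRow(out, row);
            }
        } catch (IOException e) {
            // StringWriter does not throw, but the Writer API declares it.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

For a web download, writeRow would be pointed at the servlet response writer
so rows go out as they are read instead of being buffered in memory.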

As you mentioned there may be some internal things that the open
source web interface will never be able to do. Hopefully we can
construct HWI in a way that plugging in new things is easy.

As I mentioned, I have been working offline trying to clean up the
hwi/build.xml file (there are some redundant things in it). The cleanup
will also be needed later to add the Wicket jars to hwi/lib.

In the past, adding things like hive-history was done by adding a new
JSP and linking to it from session_manage.jsp. We might need a new
paradigm now if we are going to implement an Ajax all-in-one page type
thing.

To do this in an open way (if you want to help) we should open up a
Jira and post some mock up screen-shots of how the UI would look and
then talk about how we could implement the current/new features.

Edward