Posted to dev@calcite.apache.org by Julian Hyde <ju...@hydromatic.net> on 2014/08/21 00:37:05 UTC

Exhibit: Optiq inside Hive

In a recent blog post, Josh Wills describes an interesting application of Optiq inside a Hive table-generating function:

http://blog.cloudera.com/blog/2014/08/how-to-count-events-like-a-data-scientist/

If you browse the code you’ll see that he implemented Optiq’s Schema and Table APIs based on Hive metadata, so that when running inside Hive, Optiq automatically sees all data that Hive can see.

https://github.com/jwills/exhibit

https://github.com/jwills/exhibit/tree/master/src/main/java/com/cloudera/exhibit/udtf
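
As a rough illustration of that adapter idea (this is not code from Exhibit or from Optiq), here is a minimal, dependency-free Java sketch. Every name in it (HostCatalog, SketchSchema, SketchTable, CatalogBackedSchema, InMemoryCatalog) is hypothetical; the two small interfaces only stand in for the roles that a schema and a table play, and the in-memory catalog stands in for Hive metadata. The point it tries to show is that the schema delegates every lookup to the host catalog, so anything the host can see becomes visible without copying.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the host system's metadata (Hive's, in the post).
interface HostCatalog {
    Set<String> tableNames();
    List<String> columnNames(String table);
    List<Object[]> rows(String table);
}

// Hypothetical, simplified "table" role: column names plus a row scan.
interface SketchTable {
    List<String> columns();
    Iterable<Object[]> scan();
}

// Hypothetical, simplified "schema" role: resolve table names to tables.
interface SketchSchema {
    Set<String> tableNames();
    SketchTable table(String name);
}

// The adapter: nothing is cached, every call is forwarded to the catalog.
final class CatalogBackedSchema implements SketchSchema {
    private final HostCatalog catalog;

    CatalogBackedSchema(HostCatalog catalog) {
        this.catalog = catalog;
    }

    @Override public Set<String> tableNames() {
        return catalog.tableNames();
    }

    @Override public SketchTable table(String name) {
        if (!catalog.tableNames().contains(name)) {
            return null; // unknown to the host, so unknown to the embedded engine
        }
        return new SketchTable() {
            @Override public List<String> columns() { return catalog.columnNames(name); }
            @Override public Iterable<Object[]> scan() { return catalog.rows(name); }
        };
    }
}

// Toy catalog used only to exercise the adapter.
final class InMemoryCatalog implements HostCatalog {
    private final Map<String, List<String>> columns = new LinkedHashMap<>();
    private final Map<String, List<Object[]>> data = new LinkedHashMap<>();

    void addTable(String name, List<String> cols, List<Object[]> rows) {
        columns.put(name, cols);
        data.put(name, rows);
    }

    @Override public Set<String> tableNames() { return columns.keySet(); }
    @Override public List<String> columnNames(String table) { return columns.get(table); }
    @Override public List<Object[]> rows(String table) { return data.get(table); }
}

class SchemaAdapterSketch {
    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"click", 3});
        rows.add(new Object[] {"view", 7});
        catalog.addTable("events", Arrays.asList("kind", "n"), rows);

        SketchSchema schema = new CatalogBackedSchema(catalog);
        for (String name : schema.tableNames()) {
            SketchTable t = schema.table(name);
            System.out.println(name + " " + t.columns());
            for (Object[] row : t.scan()) {
                System.out.println("  " + Arrays.toString(row));
            }
        }
    }
}
```

Because the schema holds a reference to the catalog rather than a snapshot, a table registered with the catalog after the schema is created still resolves. The real Schema and Table APIs carry more than this (row types, statistics and so on), so the linked source is the place to look for the actual implementation.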

Julian


Re: Exhibit: Optiq inside Hive

Posted by Josh Wills <jw...@cloudera.com>.
Thanks Julian, I had a lot of fun doing it.

I'm trying to decide where to take it next. It would be interesting to
apply the same SQL-within-SQL pattern to systems like HBase, MongoDB, etc.
that have the same sort of schema flexibility. For example, you could
imagine running an HBase coprocessor that re-executed SQL queries against
the contents of a row on every update in order to compute derived
quantities and/or trigger alerts.
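
To make the shape of that idea concrete, here is a small Java sketch under heavy assumptions. It does not use the HBase coprocessor API or a SQL engine: RowUpdateHook is a hypothetical class, the row is a plain map of column name to value, and an ordinary Java function stands in for the SQL query that would be re-run over the row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical per-row hook; NOT the HBase coprocessor API.
final class RowUpdateHook {
    private final Map<String, Map<String, Long>> rows = new HashMap<>();
    // Stand-in for the SQL query evaluated over one row's cells.
    private final Function<Map<String, Long>, Long> derivedQuery;
    // Stand-in for the alert condition on the derived quantity.
    private final Predicate<Long> alertCondition;
    private final List<String> alerts = new ArrayList<>();

    RowUpdateHook(Function<Map<String, Long>, Long> derivedQuery,
                  Predicate<Long> alertCondition) {
        this.derivedQuery = derivedQuery;
        this.alertCondition = alertCondition;
    }

    // Called on every update: store the cell, then re-run the query
    // over the whole row and record an alert if the condition holds.
    long onUpdate(String rowKey, String column, long value) {
        Map<String, Long> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
        row.put(column, value);
        long derived = derivedQuery.apply(row);
        if (alertCondition.test(derived)) {
            alerts.add(rowKey + "=" + derived);
        }
        return derived;
    }

    List<String> alerts() {
        return alerts;
    }
}

class RowUpdateHookDemo {
    public static void main(String[] args) {
        // Roughly "SELECT SUM(value) FROM cells_of_this_row", alerting above 100.
        RowUpdateHook hook = new RowUpdateHook(
            row -> row.values().stream().mapToLong(Long::longValue).sum(),
            total -> total > 100);
        hook.onUpdate("user1", "clicks", 40);
        hook.onUpdate("user1", "views", 70);
        System.out.println(hook.alerts());
    }
}
```

In a real coprocessor the query would presumably be planned once and only executed per update, since re-parsing SQL on every write would be costly; that detail is left out here.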

J




-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>