Posted to common-dev@hadoop.apache.org by "Ashish Thusoo (JIRA)" <ji...@apache.org> on 2008/07/09 01:14:32 UTC
[jira] Updated: (HADOOP-3601) Hive as a contrib project
[ https://issues.apache.org/jira/browse/HADOOP-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashish Thusoo updated HADOOP-3601:
----------------------------------
Attachment: HiveTutorial.pdf
Tutorial on the capabilities of Hive. This is a PDF of internal documentation and contains query, DML and DDL examples as well as an overview of the system. A formal language spec, architecture documents and roadmaps will follow. This document gives an initial preview of the system and will hopefully seed interesting discussion and questions about it.
> Hive as a contrib project
> -------------------------
>
> Key: HADOOP-3601
> URL: https://issues.apache.org/jira/browse/HADOOP-3601
> Project: Hadoop Core
> Issue Type: New Feature
> Affects Versions: 0.17.0
> Reporter: Joydeep Sen Sarma
> Priority: Minor
> Attachments: HiveTutorial.pdf
>
> Original Estimate: 1080h
> Remaining Estimate: 1080h
>
> Hive is a data warehouse built on top of flat files (stored primarily in HDFS). It includes:
> - Data Organization into Tables with logical and hash partitioning
> - A Metastore to store metadata about Tables, Partitions, etc.
> - A SQL-like query language over object data stored in Tables
> - DDL commands to define and load external data into tables
> Hive's query language is executed using Hadoop map-reduce as the execution engine. Queries can use either single-stage or multi-stage map-reduce. Hive has a native format for tables, but can handle any data set (for example JSON/Thrift/XML) using an IO library framework.
> Hive uses Antlr for query parsing, Apache JEXL for expression evaluation and may use Apache Derby as an embedded database for the Metastore. Antlr has a BSD license and should be compatible with the Apache license.
> We are currently thinking of contributing to the 0.17 branch as a contrib project (since that is the version under which it will get tested internally), but we are looking for advice on the best release path.
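As a rough illustration of the single-stage case mentioned in the description, a SQL-like aggregation that counts rows per key can be pictured as one map step, a shuffle that groups by key, and one reduce step. The sketch below is plain Python, not Hive or Hadoop code; the table contents, column names and function names are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical rows of a "page views" table; names and values are illustrative only.
ROWS = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/home"},
    {"user": "alice", "page": "/about"},
]


def map_phase(rows, key_column):
    """Emit one (key, 1) pair per input row."""
    for row in rows:
        yield row[key_column], 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by(rows, key_column):
    """Run the three steps in order: map, shuffle, reduce."""
    return reduce_phase(shuffle(map_phase(rows, key_column)))


counts = count_by(ROWS, "page")
```

A multi-stage query would, under the same assumptions, feed the output of one such map/shuffle/reduce pass into the map step of the next.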
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.