Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/09/04 23:39:05 UTC

[Hadoop Wiki] Update of "Hive" by petewyckoff

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by petewyckoff:
http://wiki.apache.org/hadoop/Hive

The comment on the change is:
made clear hive can read/write data in any format

------------------------------------------------------------------------------
  
  = What is NOT Hive =
  Hive is built on Hadoop, which is a batch processing system. Accordingly, Hive does not and cannot promise low latencies on queries. The paradigm is strictly one of submitting jobs and being notified when they complete, as opposed to real-time queries. As a result, Hive should not be compared with systems like Oracle, where analysis is done on a significantly smaller amount of data but proceeds much more iteratively, with response times between iterations of less than a few minutes. For Hive queries, response times for even the smallest jobs can be on the order of 5-10 minutes, and for larger jobs they may run into hours.
+ 
+ Hive does not require that the data it reads or writes be in a "Hive format"; there is no such thing. Hive works equally well on Thrift, RecordIO, control-delimited, or your own data format.
+ 
  
  = Status =
  Hive has been submitted as a contrib project in Hadoop trunk. Details of its availability are at [https://issues.apache.org/jira/browse/HADOOP-3601 Hadoop JIRA].