Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2009/08/12 00:48:36 UTC

[Pig Wiki] Update of "zebra" by jaytang

The following page has been changed by jaytang:
http://wiki.apache.org/pig/zebra

New page:
#format wiki
#language en
#pragma section-numbers off

= Apache Pig-Zebra Wiki =

Zebra is a storage layer that provides a high-level data access abstraction and a tabular view of data in Hadoop, which can free Pig users from implementing their own data storage and retrieval code. It provides:

  * columnar storage format for fast data projection
  * schema language to manage physical storage metadata
  * CPU/space-efficient data serialization 
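
To illustrate why a columnar layout makes projection fast, here is a minimal Python sketch. It is a toy in-memory model, not Zebra's on-disk format or API; the function names `to_columns` and `project` and the sample field names are hypothetical, and all rows are assumed to share the same fields.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
Columns = Dict[str, List[Any]]


def to_columns(rows: List[Row]) -> Columns:
    """Pivot row-oriented records into one list of values per column.

    Assumes every row has the same fields as the first row.
    """
    if not rows:
        return {}
    names = list(rows[0])
    return {name: [row[name] for row in rows] for name in names}


def project(columns: Columns, wanted: List[str]) -> Columns:
    """Return only the requested columns; the others are never read."""
    return {name: columns[name] for name in wanted}


# Hypothetical sample data.
rows = [
    {"user": "a", "age": 30, "url": "x"},
    {"user": "b", "age": 25, "url": "y"},
]
columns = to_columns(rows)
ages_only = project(columns, ["age"])
```

In a row-oriented layout, reading `age` would require scanning every full record; in the columnar layout, the cost of a projection depends only on the columns requested.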

In the future, it could also support predicate pushdown for further performance improvement. Initially, Zebra is released as a contrib project in Pig; it may become a Hadoop subproject later on.