Posted to hdfs-dev@hadoop.apache.org by 易剑 <my...@gmail.com> on 2010/02/04 09:40:52 UTC

Use DTS instead of DFS for data warehouse

*Glossary*
DTS: Distributed Table System, not a bigtable
DFS: Distributed File System


DFS is better suited to unstructured data, while DTS is better suited to
structured data. Data in a data warehouse is structured, so I think a table
is a better fit than a file. DTS works as follows:
1. Break one big logical table into many small physical tables
2. Blocks do not need to be the same size
3. Blocks do not need to be kept in order
4. Only structured data is stored
5. Block indexes are supported
6. Deleting and updating are supported
7. The interface is SQL, but it operates on only one block at a time
8. A table can be split horizontally and vertically at the same time
9. ...
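
To make items 1, 2, 5 and 8 in the list above more concrete, here is a
minimal in-memory sketch in Python. It is only an illustration under stated
assumptions, not an implementation of DTS: the names Block, split_table and
lookup, the choice of a single key column, and the fixed rows_per_block
parameter are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One small physical table: some rows and some columns of the big table."""
    columns: tuple
    rows: list = field(default_factory=list)   # each row is a dict keyed by column name
    index: dict = field(default_factory=dict)  # key value -> positions in self.rows

    def add(self, row, key):
        # Maintain a per-block index on the key column (item 5).
        self.index.setdefault(row[key], []).append(len(self.rows))
        self.rows.append(row)

    def lookup(self, value):
        # Answer a key lookup using only this block's own index.
        return [self.rows[i] for i in self.index.get(value, [])]


def split_table(rows, key, column_groups, rows_per_block):
    """Split horizontally (row ranges) and vertically (column groups) at once (item 8)."""
    blocks = []
    for start in range(0, len(rows), rows_per_block):
        chunk = rows[start:start + rows_per_block]
        for group in column_groups:
            # Every vertical slice keeps the key so rows can be matched up again.
            cols = (key,) + tuple(c for c in group if c != key)
            block = Block(columns=cols)
            for row in chunk:
                block.add({c: row[c] for c in cols}, key)
            blocks.append(block)
    return blocks


table = [
    {"id": 1, "name": "a", "city": "x", "amount": 10},
    {"id": 2, "name": "b", "city": "y", "amount": 20},
    {"id": 3, "name": "c", "city": "z", "amount": 30},
]
blocks = split_table(
    table,
    key="id",
    column_groups=[("name",), ("city", "amount")],
    rows_per_block=2,
)
```

With three rows and rows_per_block=2, the last row range holds only one row,
so the resulting blocks are not all the same size (item 2). Each block can
answer blocks[i].lookup(value) from its own index without consulting any
other block.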

Re: Use DTS instead of DFS for data warehouse

Posted by jian yi <ej...@gmail.com>.
Hi Hammer,

Thank you. I am not familiar with Zebra, but it sounds very good.

>
>
> From: Jeff Hammerbacher <ha...@cloudera.com>
> Date: February 5, 2010, 4:33 AM
> Subject: Re: Use DTS instead of DFS for data warehouse
> To: hdfs-dev@hadoop.apache.org
>
>
>
> Hey 易剑,
>
> Your proposed system sounds quite a bit like Zebra, which is a contributed
> project under the Pig subproject: http://wiki.apache.org/pig/zebra. Have you
> taken a look at Zebra?
>
> Thanks,
> Jeff

Re: Use DTS instead of DFS for data warehouse

Posted by Jeff Hammerbacher <ha...@cloudera.com>.
Hey 易剑,

Your proposed system sounds quite a bit like Zebra, which is a contributed
project under the Pig subproject: http://wiki.apache.org/pig/zebra. Have you
taken a look at Zebra?

Thanks,
Jeff
