Posted to dev@hive.apache.org by xiaohe lan <zo...@gmail.com> on 2015/04/02 06:30:03 UTC

Dataset for Hive

Hi All,

I am new to Hive. I just set up a 5-node Hadoop environment and want to
try out HiveQL.
Is there any dataset I can download to experiment with HiveQL? The dataset
should have several tables so that I can write some complex joins. About
100 GB should be fine.

Thanks,
Xiaohe

Re: Dataset for Hive

Posted by Chao Sun <su...@apache.org>.
Hi Xiaohe,

You can try TPC-DS from https://github.com/hortonworks/hive-testbench.
It contains a large number of queries with complex joins.

Chao

On Wed, Apr 1, 2015 at 9:30 PM, xiaohe lan <zo...@gmail.com> wrote:

> Hi All,
>
> I am new to Hive. Just set up a 5 node Hadoop environment and want to have
> a try on HiveQL.
> Is there any dataset I can download to play HiveQL. The dataset should have
> several tables some I can write some complex join. About 100G should be
> fine.
>
> Thanks,
> Xiaohe
>
