Posted to user@hive.apache.org by Sachit Murarka <co...@gmail.com> on 2017/12/31 03:28:23 UTC

Indexing

Hello,
I have seen some blog posts saying that indexing is not recommended and that we
should use the ORC format instead. Can you please offer a suggestion?
I could not find any official statement on this.

Kind Regards,
Sachit Murarka

Re: Indexing

Posted by Jörn Franke <jo...@gmail.com>.
Hello,

It always depends on your use case, and you should always run performance tests to verify that a given approach fits it. Hence, I doubt that you will find a generic statement on the Hive site. That said, most of the time the internal index of ORC will have more advantages, such as lower space usage.

Furthermore, to benefit from this internal index, ORC (or Parquet) requires that the data is sorted on the filtering column.
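
To illustrate why sorting matters, here is a minimal Python sketch of min/max-based skipping. It is a toy model, not ORC's actual implementation: the names build_stripes and stripes_to_read, the stripe size, and the data are all made up for this example. It only assumes that each block of rows stores the minimum and maximum of the filtering column, and that a reader skips blocks whose range cannot contain the value it is looking for.

```python
import random

def build_stripes(values, stripe_size):
    """Split values into fixed-size stripes and keep (min, max) per stripe."""
    stripes = [values[i:i + stripe_size]
               for i in range(0, len(values), stripe_size)]
    return [(min(s), max(s)) for s in stripes]

def stripes_to_read(stats, target):
    """Count stripes whose [min, max] range could contain target."""
    return sum(1 for lo, hi in stats if lo <= target <= hi)

random.seed(0)
values = list(range(100_000))   # already sorted on the filtering column
shuffled = values[:]
random.shuffle(shuffled)        # same data, arbitrary order

sorted_stats = build_stripes(values, 10_000)
unsorted_stats = build_stripes(shuffled, 10_000)

# Equality filter on a single value (hypothetical predicate: col = 42000).
sorted_hits = stripes_to_read(sorted_stats, 42_000)
unsorted_hits = stripes_to_read(unsorted_stats, 42_000)
print("stripes read, sorted:", sorted_hits, "unsorted:", unsorted_hits)
```

With sorted data the ranges do not overlap, so only the one stripe containing the value has to be read. With shuffled data almost every stripe spans nearly the full value range, so the statistics rule out little or nothing.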

Hive also provides other relevant features, such as partitioning.
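
As a rough illustration of what partitioning buys you, the following Python sketch groups rows by a partition column and scans only the matching group. This is a simplified model and not Hive code: the column names (dt, user, amount), the sample rows, and the query function are hypothetical.

```python
from collections import defaultdict

rows = [
    {"dt": "2017-12-30", "user": "a", "amount": 10},
    {"dt": "2017-12-30", "user": "b", "amount": 5},
    {"dt": "2017-12-31", "user": "a", "amount": 7},
]

# Group rows by the partition column, similar to one directory per value.
partitions = defaultdict(list)
for row in rows:
    partitions[row["dt"]].append(row)

def query(partitions, dt, user):
    """Scan only the partition matching dt, then filter rows inside it."""
    scanned = partitions.get(dt, [])
    matches = [r for r in scanned if r["user"] == user]
    return matches, len(scanned)

matches, rows_scanned = query(partitions, "2017-12-31", "a")
print(matches, "rows scanned:", rows_scanned)
```

A filter on the partition column limits the scan to one group before any row is read, which is why partitioning works best on columns that appear in most queries and have a moderate number of distinct values.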

Best regards
