Posted to user@lens.apache.org by Long Zhou <lo...@gmail.com> on 2015/02/26 16:35:42 UTC

Choosing between Kylin and Lens

Hi Kylin and Lens communities,

    I am working on a big data analysis project and am considering Kylin or
Lens. Do you have any guidelines or recommendations on how to choose the
right solution? We are particularly interested in the performance
characteristics of these two solutions on terabytes of sparse data.
    I have just started learning about the two projects. It seems Kylin is more
like MOLAP while Lens is more like ROLAP; is that correct? Do the differences
between MOLAP and ROLAP apply here?
    When using Hive as storage, it seems Kylin might perform better since
data is pre-aggregated and cached. How does Kylin handle sparse tables and
avoid empty cells in the cache? Does Lens have a cache on top of Hive?
    Lens supports columnar data warehouses like Redshift. How much
performance could we gain by loading data into Redshift?
    Where can I find performance benchmark data for the two projects?

Best regards,
Long Zhou