Posted to dev@kylin.apache.org by Yiming Liu <li...@gmail.com> on 2016/07/07 00:01:01 UTC

Re: Not Sure if my mails are bouncing , is it possible for the community to give me some direction ? //: Few Questions about Kylin Ability

Hi Santosh,

Most of your questions would be better answered by an experienced architect
than by the community developers. Kylin is designed as an OLAP-on-Hadoop
solution and, from my experience, it could fit most of your scenarios. I
suggest reading the documentation on the Kylin website first to learn the
features and concepts. If you then have more specific questions, more
people will be ready to help. If some features are not ready yet,
discussion and JIRA issues are also welcome.

2016-07-05 9:42 GMT+08:00 Santoshakhilesh <sa...@huawei.com>:

>
>
> -----Original Message-----
> From: Santosh Akhilesh [mailto:santoshakhilesh@gmail.com]
> Sent: 02 July 2016 17:55
> To: dev@kylin.apache.org
> Cc: Santoshakhilesh
> Subject: Few Questions about Kylin Ability
>
> Hi All ,
> Last year I did a PoC for one of our products using Kylin. Our
> distributed architecture journey was on hold for some time, but now we are
> back to rearchitect our system as a distributed one. I am writing this mail
> to understand whether and how Kylin can fit our requirements.
> Let me give some background on our requirements.
> Ours is a network performance management solution which needs to handle
> the following scenarios.
>
> 1. Collect data from network elements at a granularity between 30 seconds
> and 5 minutes. Every period we collect around 150 million KPIs, which are
> distributed across different service types. The service types are model
> driven and can change over time.
> 2. The data we collect needs to be available for ad hoc and OLAP-type
> queries ASAP. For example, data collected between 10:00 and 10:05 for a
> 5-minute period should be available for reports to query by 10:06. Queries
> will involve joining performance data with inventory data, with filters
> such as Area = Area1, and we also need to sort by KPI or by an inventory
> property with an ORDER BY clause.
> 3. We also need OLAP-type queries such as group by area, province,
> country, etc., applying the sum, max, min and avg aggregators. We also
> need to generate a Top talkers report, which means we need a Top N
> function.
> 4. There will be background machine learning jobs which need to scan raw
> and aggregated data.
> 5. We would be generating around 5-10 TB of data every day, and possibly
> more in the future. We need to retain data for several days or months,
> depending on the aggregation period.
> 6. Ad hoc and OLAP queries from reports should take < 2 seconds.
> So my questions are:
>
> 1. Which of these use cases can Kylin support?
> 2. How long does cube building take, and how does it handle data that is
> appended every 30 seconds or 5 minutes?
> 3. Can Kylin support both ad hoc and OLAP queries?
>
> I have several other questions but I would like to initiate the discussion
> with these.
> We plan to start a test with Kylin next week; I am just setting up a
> cluster now. We don't plan to use the Cloudera or Hortonworks sandbox, as
> our company has its own sandbox.
>
> Appreciate response from Kylin experts.
>
> Regards
> Santosh
>
>
> Sent from my iPhone
>



-- 
With Warm regards

Yiming Liu (刘一鸣)

RE: Not Sure if my mails are bouncing , is it possible for the community to give me some direction ? //: Few Questions about Kylin Ability

Posted by Santoshakhilesh <sa...@huawei.com>.
Hi Yiming,
Thanks for answering my question. I went through the Kylin docs last year and I am still on the mailing list, so I passively follow Kylin's progress through the mails.

Maybe my mail sounded like a request for an architectural answer, but my questions are about Kylin's capabilities.

If someone has experience with a similar use case, please let me know.

1. My requirement is to build the cube very often in append mode (I can delay by at most 5 minutes; as I understand it, streaming OLAP is still a PoC). So I want to know in general how long it takes to build a cube. I know it's hard to answer without knowing my facts and dimensions, but I just want a rough, high-level estimate for around 10 GB of data every 5 minutes.
2. From the mails I recall that building cubes through the UI is recommended, but I need programmatic control. Can the REST interface meet this requirement? Can I control and monitor the progress of a cube build programmatically?
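
To make point 2 concrete, here is a minimal sketch of what programmatic control over a build could look like from Python. It only constructs the HTTP requests and does not send them. The server address, the cube name `kpi_cube`, the `ADMIN`/`KYLIN` credentials, the `/cubes/{cube}/rebuild` and `/jobs/{id}` paths and the `startTime`/`endTime`/`buildType` fields are assumptions based on my reading of the Kylin REST documentation, so please check them against the REST API page for the installed version.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: address of the Kylin server and its REST base path.
KYLIN_BASE = "http://localhost:7070/kylin/api"


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic authentication header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_segment_request(cube: str, start: datetime, end: datetime,
                          user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a request to build one time-range segment.

    Assumption: the endpoint path and body fields follow the Kylin REST docs.
    """
    payload = {
        "startTime": to_epoch_millis(start),
        "endTime": to_epoch_millis(end),
        "buildType": "BUILD",
    }
    headers = basic_auth_header(user, password)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{KYLIN_BASE}/cubes/{cube}/rebuild",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


def job_status_request(job_id: str, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a request that fetches one job's state."""
    return urllib.request.Request(
        url=f"{KYLIN_BASE}/jobs/{job_id}",
        headers=basic_auth_header(user, password),
        method="GET",
    )


# Hypothetical usage for the 10:00-10:05 window from the original mail:
# req = build_segment_request(
#     "kpi_cube",
#     datetime(2016, 7, 7, 10, 0, tzinfo=timezone.utc),
#     datetime(2016, 7, 7, 10, 5, tzinfo=timezone.utc),
#     "ADMIN", "KYLIN")
# with urllib.request.urlopen(req) as resp:
#     job = json.load(resp)  # expected to describe the submitted build job
```

The idea would be to submit one such request per collection window and then poll the job request until the job reports a terminal state; the exact field names in the job response should be taken from the documentation rather than from this sketch.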

I know that once the cube is built, Kylin is lightning fast for queries, so my main concern is the pre-query process.
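
As a rough back-of-the-envelope check on that pre-query concern, the sketch below works out what the figure of about 10 GB every 5 minutes implies. It assumes decimal units (1 GB = 1000 MB) and that each batch must finish building before the next one arrives; it says nothing about how fast Kylin actually builds.

```python
def ingest_budget(batch_gb: float, interval_seconds: int) -> dict:
    """Derive daily volume and the minimum sustained build rate for periodic batches."""
    batches_per_day = (24 * 3600) // interval_seconds
    return {
        "batches_per_day": batches_per_day,
        # Total raw volume arriving per day, in TB (decimal units).
        "daily_tb": batch_gb * batches_per_day / 1000,
        # Rate needed to finish one batch within one interval, in MB/s.
        "min_mb_per_second": batch_gb * 1000 / interval_seconds,
    }


budget = ingest_budget(batch_gb=10, interval_seconds=5 * 60)
print(budget)
```

Under these assumptions there are 288 batches per day, about 2.88 TB of raw input per day, and each build has to sustain roughly 33 MB/s end to end just to avoid falling behind.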

So if anyone has experience with this, please drop a mail. Thanks in advance.

Regards,
Santosh Akhilesh
 

