Posted to user@kylin.apache.org by Nirav Patel <np...@xactlycorp.com> on 2017/05/11 18:18:23 UTC

Read write isolation; Availability of cube for query while rebuild/refresh

Hi,

I understand that Kylin currently doesn't support partial changes to existing
cube data, in which case the entire cube has to be rebuilt. What is the impact
of that on clients and the query interface? Do they have to wait while the cube
is being refreshed?
Also, what happens during an incremental refresh? If a client queries new
data that is still being built, would Kylin allow a dirty read on the cube that
is being built?

Thanks,
Nirav


Re: Read write isolation; Availability of cube for query while rebuild/refresh

Posted by Billy Liu <bi...@apache.org>.

There is no dirty read issue. If the new data (actually the new segment) is
not yet ready in the cube, users cannot query that data. This is transparent to
client queries. A query will not wait for a cube that is building or
refreshing. An incremental build simply builds a new segment.
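
A toy model of the behaviour described above may help; it is not Kylin's actual code. The class names, the `NEW`/`READY` status labels, and the `queryable_segments` helper are assumptions made for illustration. It only shows the idea that a segment still being built stays invisible to queries, while segments that are already built keep answering them.

```python
from dataclasses import dataclass, field
from enum import Enum


class SegmentStatus(Enum):
    NEW = "NEW"      # assumed label: segment is still being built
    READY = "READY"  # assumed label: build finished, segment can serve queries


@dataclass
class Segment:
    name: str
    status: SegmentStatus


@dataclass
class Cube:
    name: str
    segments: list = field(default_factory=list)

    def queryable_segments(self):
        # Queries only ever see segments whose build has completed.
        return [s for s in self.segments if s.status is SegmentStatus.READY]


cube = Cube("sales_cube", [
    Segment("2017-05-01_2017-05-10", SegmentStatus.READY),
    Segment("2017-05-10_2017-05-11", SegmentStatus.NEW),  # incremental build in progress
])

# While the build runs, only the finished segment is visible to queries.
visible = [s.name for s in cube.queryable_segments()]
print(visible)

# Once the build job finishes, the new segment becomes visible in one step.
cube.segments[1].status = SegmentStatus.READY
print([s.name for s in cube.queryable_segments()])
```

In this sketch a query never blocks on the build and never sees a half-built segment; it simply gets answers from whatever is ready at that moment.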


Re: Read write isolation; Availability of cube for query while rebuild/refresh

Posted by Nirav Patel <np...@xactlycorp.com>.

Thanks. Is there also some concept of an automatic snapshot while a cube is in
the build or refresh state?

Would it be possible to have all these design decisions documented on the
website, along with the design of the Metadata Engine, Indexes, Segments, and
Query Router? It would help us understand and evaluate Kylin better.

Cheers




Re: Read write isolation; Availability of cube for query while rebuild/refresh

Posted by Billy Liu <bi...@apache.org>.

If multiple cubes can answer the same query, such as cloned ones,
Kylin will route the query to the cube that has the lowest query cost. The
query cost is computed from dimension complexity, not query latency.
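
As a rough illustration of this routing rule (a sketch under stated assumptions, not Kylin's real query router): here the cost of a cube is simply the number of dimensions it defines, and `CubeInfo`, `cost`, and `route` are hypothetical names. The actual cost formula in Kylin may weigh dimensions differently.

```python
from dataclasses import dataclass


@dataclass
class CubeInfo:
    name: str
    dimensions: frozenset
    enabled: bool = True


def cost(cube):
    # Assumption: cost grows with the number of dimensions in the cube.
    return len(cube.dimensions)


def route(query_dimensions, cubes):
    """Return the cheapest enabled cube that covers all queried dimensions."""
    needed = frozenset(query_dimensions)
    candidates = [c for c in cubes if c.enabled and needed <= c.dimensions]
    if not candidates:
        return None
    return min(candidates, key=cost)


original = CubeInfo("sales", frozenset({"date", "region", "product", "seller"}))
clone = CubeInfo("sales_clone", frozenset({"date", "region", "product"}))

# Both cubes are enabled and can answer the query; the simpler one wins.
chosen = route({"date", "region"}, [original, clone])
print(chosen.name)

# A query on a dimension only the original has can only go to the original.
print(route({"seller"}, [original, clone]).name)

# After the original is disabled, every answerable query goes to the clone.
original.enabled = False
print(route({"date"}, [original, clone]).name)
```

The point of the sketch is that the choice depends on the cube definitions and the query's dimensions, not on measured response times.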


Re: Read write isolation; Availability of cube for query while rebuild/refresh

Posted by Nirav Patel <np...@xactlycorp.com>.

Is it achieved via the following steps?

   1. Clone the cube
   2. Make changes to the clone
   3. Rebuild the clone
   4. Enable the clone
   5. Disable the original cube so that Kylin will redirect queries to the new
   clone?

But in the interim, when both the clone and the original cube are available
on the same Hive tables, how does Kylin know which one to pick? Is it based on
query metadata (dimensions, aggregations, etc.)?

Thanks
