Posted to dev@kylin.apache.org by Jason Hale <ja...@koddi.com> on 2016/08/27 14:43:50 UTC

Reloading data

Does Kylin handle reloading data for particular dates?

If my data is at a granularity of day and I need to reload some data
because we initially received bad data for that day, how would we tell
Kylin this?

I know incremental refresh works for new data, but can we do something
similar with existing data so the cube doesn't have to be entirely rebuilt?

Re: Reloading data

Posted by Alberto Ramón <a....@gmail.com>.
Ok, thanks

I don't know whether HBase's *version* property could be useful for
ingesting small amounts of late data.

Alb

Re: Reloading data

Posted by Li Yang <li...@apache.org>.
You can refresh an existing cube segment to pick up data changes within
the range of that segment.
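
A minimal sketch of what a refresh request for a single day might look like.
It only builds the request and does not send it. The endpoint path
(/kylin/api/cubes/{cube}/rebuild), the body fields (startTime, endTime,
buildType) and the "REFRESH" value are assumptions based on Kylin's REST API
documentation and should be checked against the deployed version. The base
URL and cube name are hypothetical, and the range is assumed to match the
boundaries of an existing segment, in UTC.

```python
import json
from datetime import date, datetime, timedelta, timezone
from typing import Dict, Tuple


def day_range_ms(day: date) -> Tuple[int, int]:
    """Return the [start, end) bounds of a calendar day as UTC epoch milliseconds."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def refresh_request(base_url: str, cube_name: str, day: date) -> Dict[str, str]:
    """Describe (but do not send) a segment refresh request for one day.

    The path and body field names are assumptions; verify them against
    the REST API documentation of the Kylin version in use.
    """
    start_ms, end_ms = day_range_ms(day)
    body = {"startTime": start_ms, "endTime": end_ms, "buildType": "REFRESH"}
    return {
        "method": "PUT",
        "url": f"{base_url}/kylin/api/cubes/{cube_name}/rebuild",
        "body": json.dumps(body),
    }


if __name__ == "__main__":
    # Hypothetical host and cube name, for illustration only.
    req = refresh_request("http://localhost:7070", "sales_cube", date(2016, 8, 20))
    print(req["method"], req["url"])
    print(req["body"])
```

Sending the request (with whatever authentication the server requires) is
left out on purpose, so the sketch can be inspected without a running server.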

Managing segment granularity is important if you expect late data.
Keep recent data in small segments, so that when a data update arrives, you
only have to refresh one or a few small segments.
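
To illustrate why small recent segments help, here is a rough sketch that
picks the segments overlapping a corrected date range and counts how many
days of data would be rebuilt. The segment layout (one monthly segment plus
daily segments) is a made-up example, and the helper names are hypothetical;
this models the idea only and does not call Kylin.

```python
from datetime import date
from typing import List, Tuple

# A segment is a half-open date range: [start, end).
Segment = Tuple[date, date]


def segments_to_refresh(
    segments: List[Segment], changed_start: date, changed_end: date
) -> List[Segment]:
    """Return the segments that overlap the changed range [changed_start, changed_end)."""
    return [
        (start, end)
        for start, end in segments
        if start < changed_end and changed_start < end
    ]


def days_rebuilt(segments: List[Segment]) -> int:
    """Total number of days covered by the given segments."""
    return sum((end - start).days for start, end in segments)


# Hypothetical layout: older data merged into a month, recent data kept daily.
layout: List[Segment] = [
    (date(2016, 7, 1), date(2016, 8, 1)),    # July, one large segment
    (date(2016, 8, 24), date(2016, 8, 25)),  # daily segments for recent data
    (date(2016, 8, 25), date(2016, 8, 26)),
    (date(2016, 8, 26), date(2016, 8, 27)),
]

if __name__ == "__main__":
    recent = segments_to_refresh(layout, date(2016, 8, 25), date(2016, 8, 26))
    old = segments_to_refresh(layout, date(2016, 7, 10), date(2016, 7, 11))
    print("late data for a recent day:", days_rebuilt(recent), "day(s) rebuilt")
    print("late data for an old day:", days_rebuilt(old), "day(s) rebuilt")
```

In this layout a correction to a recent day touches a single daily segment,
while a correction to a day in July means refreshing the whole month.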

Cheers
Yang

Re: Reloading data

Posted by Alberto Ramón <a....@gmail.com>.
I'm also interested in this question:

How can I manage late data?
If all aggregates are linear (like sums, counts, ...), do I need to
recalculate all the data for the slot?

BR
