Posted to dev@kylin.apache.org by Vineet Mishra <cl...@gmail.com> on 2015/06/11 12:38:03 UTC

Schedule Cube Creation

Hi,

Is there a way to schedule cube creation, purging the older cube and
creating a new one every day over the same window, say around a month?

My requirement is pretty straightforward: I want to build a cube over the
last month's data set, and it should refresh every day with the last
month of data as of that date.

Thanks!

Re: Schedule Cube Creation

Posted by dong wang <el...@gmail.com>.
Hi Vineet Mishra, to achieve your goal, I think you can use a shell script
with Kylin's REST API.
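
A minimal sketch of that suggestion, written in Python rather than shell. It computes a rolling window and prepares a build request with a Basic authentication header. The `/cubes/{cube}/rebuild` path and the `startTime`, `endTime` and `buildType` fields are assumptions based on the Kylin REST documentation and should be checked against your Kylin version; the host, cube name and credentials are hypothetical. Nothing is sent: the request is only constructed.

```python
import base64
import json
from datetime import date, datetime, timedelta, timezone
from urllib.request import Request


def rolling_window_ms(today, days=30):
    """Return (start_ms, end_ms): the `days` days ending at 00:00 UTC on `today`."""
    end = datetime(today.year, today.month, today.day, tzinfo=timezone.utc)
    start = end - timedelta(days=days)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def build_request(base_url, cube, user, password, start_ms, end_ms):
    """Prepare (but do not send) a PUT request asking Kylin to build a segment."""
    # Assumed endpoint and JSON fields; verify against your Kylin version.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    body = json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": "BUILD"}
    ).encode()
    return Request(
        f"{base_url}/cubes/{cube}/rebuild",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    start_ms, end_ms = rolling_window_ms(date.today())
    # Hypothetical host, cube name and credentials.
    req = build_request("http://localhost:7070/kylin/api", "my_cube",
                        "ADMIN", "KYLIN", start_ms, end_ms)
    print(req.get_method(), req.full_url, req.data.decode())
```

To submit the request for real, pass it to `urllib.request.urlopen`, for example from a script that a daily cron entry runs.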

2015-06-17 18:34 GMT+08:00 Vineet Mishra <cl...@gmail.com>:

> Any update on this?
>
> Thanks,

Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Any update on this?

Thanks,

Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Any update on this?

On Tue, Jun 30, 2015 at 6:00 PM, Vineet Mishra <cl...@gmail.com>
wrote:

> Hi,
>
> Running/scheduling multiple jobs at once is killing all the other jobs
> except one.
>
> So I have three cubes and a corresponding build job for each.
> It's failing at the third step, Build Dimension Dictionary, with a
> FileNotFoundException:
>
> java.io.FileNotFoundException: File does not exist:
> /tmp/kylin-65abae6a-72e0-4b59-880b-ece8ab49b33b/sc_sd_esd_diff1/fact_distinct_columns/cn
>
> java.io.FileNotFoundException: File does not exist:
> /tmp/kylin-b9145673-de15-4304-8c99-431618219c28/sc_o2s_metrics_verified/fact_distinct_columns/sc
>
> Any suggestions would be highly appreciated!
>
> Thanks,

Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Thanks Shi!

The table was accidentally empty; that was the reason.

On Wed, Jul 1, 2015 at 8:09 AM, Shi, Shaofeng <sh...@ebay.com> wrote:

> This error is usually caused by there being no records in the selected
> time range. You can verify this by checking the “data size” of the first
> job step; if it is very small, that is the case. Since there is no data in
> the flat Hive table, the second step outputs no distinct values, and the
> third step then reports the file-not-found error. Please check your Hive
> table and the selected date range.

Re: Schedule Cube Creation

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
This error is usually caused by there being no records in the selected
time range. You can verify this by checking the “data size” of the first
job step; if it is very small, that is the case. Since there is no data in
the flat Hive table, the second step outputs no distinct values, and the
third step then reports the file-not-found error. Please check your Hive
table and the selected date range.
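
One way a scheduled job could act on this is to check the source range before submitting a build. The sketch below is only an illustration: `count_rows` and `submit_build` are hypothetical callables supplied by the caller (for instance a Hive `COUNT(*)` query and a Kylin REST call), not part of any Kylin API.

```python
def run_daily_build(count_rows, submit_build, start_ms, end_ms):
    """Submit a build for [start_ms, end_ms) unless the source range is empty.

    `count_rows` and `submit_build` are hypothetical callables provided by
    the caller; this function only decides whether a build is worth starting.
    """
    rows = count_rows(start_ms, end_ms)
    if rows <= 0:
        # An empty range is what leads to the file-not-found failure described
        # above, so skip the build instead of letting it fail at step three.
        return f"skipped: no rows in [{start_ms}, {end_ms})"
    submit_build(start_ms, end_ms)
    return f"submitted: {rows} rows"
```

Skipping is a design choice; a job could just as well raise an alert, since an unexpectedly empty table (as turned out to be the case here) usually means the upstream load failed.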

On 6/30/15, 8:30 PM, "Vineet Mishra" <cl...@gmail.com> wrote:

>Hi,
>
>Running/scheduling multiple jobs at once is killing all the other jobs
>except one.
>
>So I have three cubes and a corresponding build job for each.
>It's failing at the third step, Build Dimension Dictionary, with a
>FileNotFoundException:
>
>java.io.FileNotFoundException: File does not exist:
>/tmp/kylin-65abae6a-72e0-4b59-880b-ece8ab49b33b/sc_sd_esd_diff1/fact_distinct_columns/cn
>
>java.io.FileNotFoundException: File does not exist:
>/tmp/kylin-b9145673-de15-4304-8c99-431618219c28/sc_o2s_metrics_verified/fact_distinct_columns/sc
>
>Any suggestions would be highly appreciated!
>
>Thanks,

Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Hi,

Running/scheduling multiple jobs at once is killing all the other jobs
except one.

So I have three cubes and a corresponding build job for each.
It's failing at the third step, Build Dimension Dictionary, with a
FileNotFoundException:

java.io.FileNotFoundException: File does not exist:
/tmp/kylin-65abae6a-72e0-4b59-880b-ece8ab49b33b/sc_sd_esd_diff1/fact_distinct_columns/cn

java.io.FileNotFoundException: File does not exist:
/tmp/kylin-b9145673-de15-4304-8c99-431618219c28/sc_o2s_metrics_verified/fact_distinct_columns/sc

Any suggestions would be highly appreciated!

Thanks,

On Wed, Jun 24, 2015 at 9:32 PM, Vineet Mishra <cl...@gmail.com>
wrote:

> Thanks Shi!

Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Thanks Shi!

On Wed, Jun 24, 2015 at 12:59 PM, Shi, Shaofeng <sh...@ebay.com> wrote:

> Yes, purge can also be requested via the REST API; see the API list:
>
> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Kylin%20Restful%20API%20List.md

Re: Schedule Cube Creation

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
Yes, purge can also be requested via the REST API; see the API list:

https://github.com/apache/incubator-kylin/blob/master/docs/REST/Kylin%20Restful%20API%20List.md
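
A hedged sketch of what a purge-then-rebuild sequence might look like as an ordered list of REST calls. The `disable`, `purge` and `rebuild` paths and the JSON body are assumptions drawn from the Kylin REST documentation, and the idea that a cube must be disabled before it can be purged is likewise an assumption to verify; none of this is confirmed in the thread. The function only builds the plan and performs no network calls.

```python
import json


def full_refresh_plan(base_url, cube, start_ms, end_ms):
    """Return the ordered (method, url, body) calls for a purge-and-rebuild."""
    cube_url = f"{base_url}/cubes/{cube}"
    build_body = json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": "BUILD"}
    )
    return [
        # Assumed: a cube has to be disabled before it can be purged.
        ("PUT", f"{cube_url}/disable", None),
        # Assumed purge endpoint: drops all existing segments of the cube.
        ("PUT", f"{cube_url}/purge", None),
        # Assumed build endpoint: builds the fresh date window from scratch.
        ("PUT", f"{cube_url}/rebuild", build_body),
    ]


if __name__ == "__main__":
    # Hypothetical host, cube name and epoch-millisecond window.
    for method, url, body in full_refresh_plan(
            "http://localhost:7070/kylin/api", "my_cube", 0, 86400000):
        print(method, url, body)
```

A build is a long-running job, so a real script would also need to wait for it to finish, and may need to re-enable the cube afterwards depending on the Kylin version.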


On 6/24/15, 3:07 PM, "Vineet Mishra" <cl...@gmail.com> wrote:

>Hi Shi,
>
>For my use case, the data can change throughout, from the very
>beginning, on each following day, as the Hive table is truncated and
>reloaded from scratch; the process is meant to work like that. I guess in
>that case purging the cube and rebuilding from scratch would be the
>better, and only, option.
>
>Referring to the mentioned URL for the Kylin API
>
>https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
>
>is it even possible to purge the cube and rebuild through the API? I can
>only see the build and merge options mentioned there.
>
>Thanks!
>
>
>On Wed, Jun 24, 2015 at 8:13 AM, Shi, Shaofeng <sh...@ebay.com> wrote:
>
>> Hi Vineet,
>>
>> It depends on your scenario:
>>
>> Say you have built the data from 23 May (inclusive) to 23 June
>> (exclusive), and now the data for 23 June loads into Hive. If the
>> historic data (23 May to 23 June) in Hive will not change, you don’t
>> need to build that again; you just need to build a new date range, from
>> 23 to 24 June. After the build there will be two cube “segments”: one
>> for the past month, and a second for 23 June to 24 June. We call this an
>> “incremental build”. Kylin scans all cube segments (each segment is an
>> HBase table) when executing a SQL query, so a big full build and
>> multiple incremental builds give you the same query result. We suggest
>> using incremental builds, as that saves resources and time on the cube
>> build.
>>
>> But if your data in Hive will change and you expect the cube data to
>> stay in sync with Hive, you need to refresh the historic cube segment,
>> or rebuild the whole date range each time.
>>
>>
>> On 6/23/15, 5:40 PM, "Vineet Mishra" <cl...@gmail.com> wrote:
>>
>> >Hi Shi,
>> >
>> >Referring to the above link, I want to refresh the cube each day with
>> >the corresponding last month's data.
>> >
>> >So my requirement is something like this: today I want to build the
>> >cube for the last month of data, that is from 23 May to 23 June, and
>> >tomorrow I will require the cube for the date range 24 May to 24 June,
>> >and so on.
>> >
>> >Can you show me how to use the mentioned API in that case and move
>> >forward? Will it still be considered a refresh build, or a full cube
>> >build?
>> >
>> >Thanks!
>> >
>> >On Tue, Jun 23, 2015 at 1:30 AM, Vineet Mishra <cl...@gmail.com>
>> >wrote:
>> >
>> >> Hi Shi,
>> >>
>> >> I am not aware off the basic authentication mechanism mentioned here,
>> >> could you help me out as how could I schedule my cube refresh using
>>the
>> >>API.
>> >>
>> >> I tried java URL Connection for basic authentication but couldn't get
>> >>any
>> >> cookies to move further.
>> >>
>> >> Thanks,
>> >>
>> >> On Wed, Jun 17, 2015 at 7:21 PM, Shi, Shaofeng <sh...@ebay.com>
>> wrote:
>> >>
>> >>> You don’t need repeatedly create the cube; We call it “BUILD” or
>> >>> “REFRESH”: “BUILD” is to build a new segment; “REFRESH” is to
>>update an
>> >>> existing cube segment, please check:
>> >>>
>> >>>
>> >>>
>> >>>
>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20
>> >>>Cub
>> >>> e%20with%20Restful%20API.md
>> >>>
>> >>>
>> >>> On 6/11/15, 6:38 PM, "Vineet Mishra" <cl...@gmail.com> wrote:
>> >>>
>> >>> >Hi,
>> >>> >
>> >>> >Is there a way so that I can schedule the cube creation by purging
>>the
>> >>> >older cube and creating the new cube everyday taking the same
>>window
>> >>> >interval say around a month or so.
>> >>> >
>> >>> >So my requirement is pretty straightforward, I want to build a cube
>> >>> >considering for the last one month data set and which should
>>refresh
>> >>> every
>> >>> >day with last one month data from the particular date.
>> >>> >
>> >>> >Thanks!
>> >>>
>> >>>
>> >>
>>
>>


Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Hi Shi,

For my use case, the data can change all the way back to the beginning on
any given day, because the Hive table is truncated and reloaded from scratch,
and the process is meant to work that way. I guess in that case purging the
cube and rebuilding it from scratch would be the better, and only, option.

Referring to the URL mentioned for the Kylin API:

https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md

Is it even possible to purge the cube and rebuild it through the API? I can
only see the build and merge options mentioned there.

Thanks!



Re: Schedule Cube Creation

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
Hi Vineet,

It depends on your scenario.

Say you have built the data from 23 May (inclusive) to 23 June
(exclusive), and now the data for 23 June is loaded into Hive. If the
historic data (23 May to 23 June) in Hive will not change, you don’t need to
build that again; you just need to build a new date range, from 23 June to
24 June. After the build there will be two cube “segments”: one for the past
month, and a second for 23 June to 24 June. We call this an “incremental
build”. Kylin scans all cube segments (each segment is an HBase table)
when executing a SQL query, so one big full build and multiple
incremental builds give the same query result. We suggest using
incremental builds, as that saves resources and time on the cube build.

But if your data in Hive will change and you expect the cube data to stay in
sync with Hive, you need to refresh the historic cube segment, or rebuild
the whole data range each time.
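
To make the segment boundaries concrete, here is a minimal sketch of the
half-open date ranges described above. It only computes the ranges and checks
that they line up; it does not talk to Kylin. Expressing the boundaries as
epoch milliseconds at midnight UTC is an assumption for illustration, and the
helper names are hypothetical.

```python
from datetime import date, datetime, timezone


def to_epoch_ms(d):
    """Midnight UTC of the given date, in milliseconds since the epoch."""
    midnight = datetime(d.year, d.month, d.day, tzinfo=timezone.utc)
    return int(midnight.timestamp() * 1000)


def segment_range(start, end):
    """Half-open [start, end) range for one cube segment."""
    if end <= start:
        raise ValueError("end must be after start")
    return to_epoch_ms(start), to_epoch_ms(end)


def covers_without_gaps(segments):
    """True if each segment starts exactly where the previous one ends."""
    ordered = sorted(segments)
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


# The month that was already built, and the one new day to add.
initial = segment_range(date(2015, 5, 23), date(2015, 6, 23))
increment = segment_range(date(2015, 6, 23), date(2015, 6, 24))

# Together the two segments span the same range as one full build would.
full = segment_range(date(2015, 5, 23), date(2015, 6, 24))
print(covers_without_gaps([initial, increment]))
print((initial[0], increment[1]) == full)
```

Because the end of each range is exclusive, the next day's segment can start
at exactly the previous end without overlapping or leaving a gap.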




Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Hi Shi,

Referring to the above link, I want to refresh the cube each day with the
corresponding last month's data.

So my requirement is something like this: today I want to build the cube for
the last month of data, that is from 23 May to 23 June, and tomorrow I will
need the cube for the date range 24 May to 24 June, and so on.

Can you guide me on how to use the mentioned API in that case and move
forward? Would this still be considered a refresh build, or a full cube build?
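
The rolling window described here can be sketched as a small date
calculation. This is only an illustration of the window arithmetic, not of
any Kylin call; the function names are hypothetical, and clamping to the last
day of a shorter month (for example 31 March back to 28 February) is an
assumption about what "one month back" should mean.

```python
import calendar
from datetime import date


def one_month_before(d):
    """Same day of the previous month, clamped to that month's last day."""
    if d.month > 1:
        year, month = d.year, d.month - 1
    else:
        year, month = d.year - 1, 12
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def rolling_window(end):
    """(start, end) covering the month of data that ends on `end`."""
    return one_month_before(end), end


# Today's window and tomorrow's window from the example above.
print(rolling_window(date(2015, 6, 23)))
print(rolling_window(date(2015, 6, 24)))
```

A daily scheduled job (cron, for instance) could call `rolling_window` with
the current date and pass the resulting range to whatever build request is
used.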

Thanks!


Re: Schedule Cube Creation

Posted by Vineet Mishra <cl...@gmail.com>.
Hi Shi,

I am not aware of the basic authentication mechanism mentioned here. Could
you help me out with how I could schedule my cube refresh using the API?

I tried Java's URLConnection for basic authentication but couldn't get any
cookies to proceed further.
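
For reference, HTTP basic authentication does not need a cookie by itself:
the client sends an `Authorization: Basic <base64(user:password)>` header,
and many servers accept that header on every request. The sketch below only
constructs such a request without sending it. The endpoint path
(`/cubes/<cube>/rebuild`), the `PUT` method, the JSON field names, the base
URL, the cube name and the credentials are all assumptions or placeholders
for illustration; please check them against the REST document linked in this
thread.

```python
import base64
import json
import urllib.request


def basic_auth_header(user, password):
    """Build the standard HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def make_build_request(base_url, cube, start_ms, end_ms, build_type,
                       user, password):
    """Prepare (but do not send) a cube build request.

    The path and payload shape are assumptions, not confirmed API details.
    """
    body = json.dumps({
        "startTime": start_ms,
        "endTime": end_ms,
        "buildType": build_type,  # e.g. "BUILD" or "REFRESH"
    }).encode("utf-8")
    headers = basic_auth_header(user, password)
    headers["Content-Type"] = "application/json"
    url = f"{base_url}/cubes/{cube}/rebuild"
    return urllib.request.Request(url, data=body, headers=headers,
                                  method="PUT")


# Placeholder values; replace with your own server, cube and account.
req = make_build_request("http://localhost:7070/kylin/api", "my_cube",
                         0, 86400000, "BUILD", "ADMIN", "KYLIN")
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)
```

The same header can be set on a Java `HttpURLConnection` with
`setRequestProperty("Authorization", ...)`.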

Thanks,


Re: Schedule Cube Creation

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
You don’t need to repeatedly create the cube. We call it “BUILD” or
“REFRESH”: “BUILD” builds a new segment; “REFRESH” updates an
existing cube segment. Please check:

https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
 
