Posted to dev@kylin.apache.org by Li Yang <li...@apache.org> on 2015/07/06 08:16:44 UTC

Kylin upgraded to Calcite 1.0

KYLIN-780 is finally resolved. The latest 0.7 and 0.8 branches now use
Calcite 1.0.

This means Kylin's SQL capability has stepped up to a higher level. For
example, support for window functions should be very close. (It may already
be supported; I just haven't tested it yet.)


Cheers
Yang

Re: Kylin upgraded to Calcite 1.0

Posted by Jacques Nadeau <ja...@apache.org>.

Congrats, this is great news!

A number of us were also waiting for this to start exploring Kylin/Drill
integration opportunities.

On Sun, Jul 5, 2015 at 11:16 PM, Li Yang <li...@apache.org> wrote:

> KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
> Calcite 1.0.
>
> This means Kylin's SQL capability has stepped up to a higher level. E.g.
> the support of window function should be very close. (It maybe already
> supported, I just haven't tested yet.)
>
>
> Cheers
> Yang
>

Re: Kylin 0.7.1 - Failed to build a cube

Posted by 周千昊 <z....@gmail.com>.

Hi Gaspare,
     Kylin assumes that a dimension table is small enough to fit in memory,
so the corresponding directory should contain only one file.
     As a workaround, you can merge these files into a single file so that
Kylin can read from it.
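
The merge workaround can be sketched as follows. This is a minimal illustration on a local filesystem, not a Kylin or Hadoop API: the function name merge_part_files and the paths are hypothetical. On HDFS, a similar result could probably be reached with `hdfs dfs -getmerge` followed by an upload of the merged file, or by writing the RDD from Spark with a single partition (for example via `coalesce(1)`); both are assumptions about the poster's setup.

```python
import shutil
from pathlib import Path


def merge_part_files(src_dir: str, dest_file: str) -> int:
    """Concatenate every non-empty part-* file in src_dir into dest_file.

    Files are merged in name order. Zero-length files and marker files
    such as _SUCCESS are skipped. Returns the number of files merged.
    """
    parts = sorted(
        p for p in Path(src_dir).glob("part-*")
        if p.is_file() and p.stat().st_size > 0
    )
    with open(dest_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the bytes so large part files are not loaded whole.
                shutil.copyfileobj(src, out)
    return len(parts)
```

The merged file would then need to be the only non-empty file in the table's directory, so the original part files have to be moved elsewhere or the table pointed at a new location.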

On Tue, Jul 7, 2015 at 6:42 PM, <ga...@gfmintegration.it> wrote:

> Hi,
>
> I am trying to create a cube from a star schema created using Hive
> External tables (below an example) stored as TEXT FILE (CSV).
>
> CREATE EXTERNAL TABLE IF NOT EXISTS USERS_TABLE  (
>    uid INT,
>    name STRING
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\073' LINES TERMINATED BY '\012'
> STORED AS TEXTFILE
> LOCATION '/data/users';
>
>
> The CSV files are obtained from Spark RDDs, so they are saved as part-xxxx.
> The HDFS listing is below:
>
> hdfs dfs -ls /data/users
> Found 12 items
> -rw-r--r--   3 hdfs hdfs          0 2015-07-07 12:05 /data/users/_SUCCESS
> -rw-r--r--   3 hdfs hdfs    3699360 2015-07-07 12:05 /data/users/part-00000
> -rw-r--r--   3 hdfs hdfs    3694740 2015-07-07 12:05 /data/users/part-00001
> -rw-r--r--   3 hdfs hdfs    3685374 2015-07-07 12:05 /data/users/part-00002
> -rw-r--r--   3 hdfs hdfs    3719646 2015-07-07 12:05 /data/users/part-00003
> -rw-r--r--   3 hdfs hdfs    3682476 2015-07-07 12:05 /data/users/part-00004
> -rw-r--r--   3 hdfs hdfs    3679956 2015-07-07 12:05 /data/users/part-00005
> -rw-r--r--   3 hdfs hdfs    3700242 2015-07-07 12:05 /data/users/part-00006
> -rw-r--r--   3 hdfs hdfs    3672186 2015-07-07 12:05 /data/users/part-00007
> -rw-r--r--   3 hdfs hdfs    3682350 2015-07-07 12:05 /data/users/part-00008
> -rw-r--r--   3 hdfs hdfs    3680292 2015-07-07 12:05 /data/users/part-00009
> -rw-r--r--   3 hdfs hdfs    3697722 2015-07-07 12:05 /data/users/part-00010
>
> The cube build job fails when it tries to build the dimension dictionary,
> with the following exception (it seems that the Hive table data directory
> MUST contain only one file):
>
> java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under
> hdfs://gas.gfmintegration.it:8020/data/cdr/bb/dimensions/users, but find
> 11
>         at
> org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
>         at
> org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
>         at
> org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
>         at
> org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
>         at
> org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
>         at
> org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
>         at
> org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
>         at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
>         at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>         at
> org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>         at
> org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>         at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>         at
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>         at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>         at
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> result code:2
>
>
> Do you have any indications on how to create a proper Hive star schema for
> Kylin?
>
> I would like to use external tables (stored as CSV, parquet files or
> HBase) because I need to process the same data also from Spark.
>
> Thanks in advance.
>
> BR,
>
> -- gas
>
>
>
>

Kylin 0.7.1 - Failed to build a cube

Posted by ga...@gfmintegration.it.

Hi,

I am trying to create a cube from a star schema built on Hive external tables (an example is below) stored as text files (CSV).

CREATE EXTERNAL TABLE IF NOT EXISTS USERS_TABLE  (
   uid INT,
   name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\073' LINES TERMINATED BY '\012'
STORED AS TEXTFILE
LOCATION '/data/users';
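
A note on the delimiters in the DDL above: '\073' and '\012' read as octal character escapes, which would make the field separator a semicolon and the line separator a newline. The sketch below only decodes the two escapes; the helper name decode_octal_escape and the sample row are made up for illustration, and the reading of the escapes as octal is an assumption about how Hive treats them.

```python
def decode_octal_escape(escape: str) -> str:
    """Turn an octal escape such as '\\073' into the character it denotes."""
    if not escape.startswith("\\"):
        raise ValueError(f"not a backslash escape: {escape!r}")
    return chr(int(escape[1:], 8))


field_sep = decode_octal_escape(r"\073")  # ';'
line_sep = decode_octal_escape(r"\012")   # '\n'

# A hypothetical two-row file in the layout the table would expect.
raw = "1;alice\n2;bob\n"
rows = [line.split(field_sep) for line in raw.split(line_sep) if line]
print(rows)  # [['1', 'alice'], ['2', 'bob']]
```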
 

The CSV files are obtained from Spark RDDs, so they are saved as part-xxxx files. The HDFS listing is below:

hdfs dfs -ls /data/users
Found 12 items
-rw-r--r--   3 hdfs hdfs          0 2015-07-07 12:05 /data/users/_SUCCESS
-rw-r--r--   3 hdfs hdfs    3699360 2015-07-07 12:05 /data/users/part-00000
-rw-r--r--   3 hdfs hdfs    3694740 2015-07-07 12:05 /data/users/part-00001
-rw-r--r--   3 hdfs hdfs    3685374 2015-07-07 12:05 /data/users/part-00002
-rw-r--r--   3 hdfs hdfs    3719646 2015-07-07 12:05 /data/users/part-00003
-rw-r--r--   3 hdfs hdfs    3682476 2015-07-07 12:05 /data/users/part-00004
-rw-r--r--   3 hdfs hdfs    3679956 2015-07-07 12:05 /data/users/part-00005
-rw-r--r--   3 hdfs hdfs    3700242 2015-07-07 12:05 /data/users/part-00006
-rw-r--r--   3 hdfs hdfs    3672186 2015-07-07 12:05 /data/users/part-00007
-rw-r--r--   3 hdfs hdfs    3682350 2015-07-07 12:05 /data/users/part-00008
-rw-r--r--   3 hdfs hdfs    3680292 2015-07-07 12:05 /data/users/part-00009
-rw-r--r--   3 hdfs hdfs    3697722 2015-07-07 12:05 /data/users/part-00010

The cube build job fails when it tries to build the dimension dictionary, with the following exception (it seems that the Hive table data directory MUST contain only one file):

java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under hdfs://gas.gfmintegration.it:8020/data/cdr/bb/dimensions/users, but find 11
	at org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
	at org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
	at org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
	at org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
	at org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
	at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

result code:2
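
Going only by the exception message, the failing check seems to count the non-zero-length files in the table directory and require exactly one. The sketch below re-creates that rule against `hdfs dfs -ls` output. It is not Kylin's source code: the function names are hypothetical, and the parser assumes the 8-column listing format shown above with no spaces in paths.

```python
from typing import List


def non_zero_files(ls_output: str) -> List[str]:
    """Return the paths of regular files with size > 0 from a listing."""
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        # A file entry has 8 columns: permissions, replication, owner,
        # group, size, date, time, path. This skips the "Found N items"
        # header and directory entries (which start with 'd').
        if len(fields) != 8 or not fields[0].startswith("-"):
            continue
        if int(fields[4]) > 0:
            paths.append(fields[7])
    return paths


def find_only_file(ls_output: str) -> str:
    """Return the single non-empty file, or raise if there is not exactly one."""
    files = non_zero_files(ls_output)
    if len(files) != 1:
        raise ValueError(
            f"Expect 1 and only 1 non-zero file, but find {len(files)}"
        )
    return files[0]


sample = """Found 3 items
-rw-r--r--   3 hdfs hdfs          0 2015-07-07 12:05 /data/users/_SUCCESS
-rw-r--r--   3 hdfs hdfs    3699360 2015-07-07 12:05 /data/users/part-00000
-rw-r--r--   3 hdfs hdfs    3694740 2015-07-07 12:05 /data/users/part-00001
"""
print(len(non_zero_files(sample)))  # 2
```

Applied to the full listing above, the same rule skips the empty _SUCCESS marker and counts the 11 part files, which matches the count in the exception.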


Do you have any advice on how to create a proper Hive star schema for Kylin?

I would like to use external tables (stored as CSV, Parquet files or HBase) because I also need to process the same data from Spark.

Thanks in advance.

BR,

-- gas

Re: Kylin upgraded to Calcite 1.0

Posted by Luke Han <lu...@gmail.com>.

Thanks, Yang and Julian. We now have the latest Calcite capabilities.

Calcite 1.3 will be included in the Kylin 0.7.3 release next month.

For people who want to try it now, please check out the "0.7" branch for the
latest stable dev code.

Thanks.



Best Regards!
---------------------

Luke Han

On Wed, Jul 8, 2015 at 1:57 AM, Julian Hyde <jh...@apache.org> wrote:

> That’s great. Now Kylin is up to speed, and because Calcite releases about
> once per month, when you log an issue such as
> https://issues.apache.org/jira/browse/CALCITE-788 we can get the issue
> fixed in Calcite, released in Calcite, and into a released version of Kylin
> within a couple of months.
>
> On Jul 7, 2015, at 1:38 AM, Li Yang <li...@apache.org> wrote:
>
> > Yeah, upgraded to 1.3.0 on branch 0.7-staging & 0.8. The calcite-core API
> > change is minimal, while Avatica API upgrade took me some time.
> >
> > Kylin shall keep up with latest calcite release from now on.  :-)
> >
> > On Tue, Jul 7, 2015 at 10:17 AM, Li Yang <li...@apache.org> wrote:
> >
> >>> Upgrading to 1.3 should give you a lot of benefit for modest investment
> >>
> >> Sounds charming~~   Let me have a quick try.  :-)
> >>
> >> On Tue, Jul 7, 2015 at 4:18 AM, Julian Hyde <jh...@apache.org> wrote:
> >>
> >>> Great news, and well done!
> >>>
> >>> Getting to 1.0 was the hard part, because of the API changes, but I
> >>> suggest you don’t stop there.
> >>>
> >>> Calcite has continued to evolve over the last 6 months (see release
> >>> notes [1]); 1.3 is the latest release and 1.4 is probably a couple of
> >>> weeks away. There are a lot of improvements and bug-fixes in each
> >>> release, and since these are minor releases post 1.0, we have made
> >>> efforts to maintain compatibility.
> >>>
> >>> Upgrading to 1.3 should give you a lot of benefit for modest
> >>> investment, so I think you should do it as soon as possible. I’ll be
> >>> glad to help.
> >>>
> >>> Julian
> >>>
> >>> [1] http://calcite.incubator.apache.org/docs/history.html
> >>>
> >>>
> >>>
> >>> On Jul 5, 2015, at 11:16 PM, Li Yang <li...@apache.org> wrote:
> >>>
> >>>> KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
> >>>> Calcite 1.0.
> >>>>
> >>>> This means Kylin's SQL capability has stepped up to a higher level.
> >>>> E.g. the support of window function should be very close. (It maybe
> >>>> already supported, I just haven't tested yet.)
> >>>>
> >>>>
> >>>> Cheers
> >>>> Yang
> >>>
> >>>
> >>
>
>

Re: Kylin upgraded to Calcite 1.0

Posted by Julian Hyde <jh...@apache.org>.

That’s great. Now Kylin is up to speed, and because Calcite releases about once per month, when you log an issue such as https://issues.apache.org/jira/browse/CALCITE-788 we can get the issue fixed in Calcite, released in Calcite, and into a released version of Kylin within a couple of months.

On Jul 7, 2015, at 1:38 AM, Li Yang <li...@apache.org> wrote:

> Yeah, upgraded to 1.3.0 on branch 0.7-staging & 0.8. The calcite-core API
> change is minimal, while Avatica API upgrade took me some time.
> 
> Kylin shall keep up with latest calcite release from now on.  :-)
> 
> On Tue, Jul 7, 2015 at 10:17 AM, Li Yang <li...@apache.org> wrote:
> 
>>> Upgrading to 1.3 should give you a lot of benefit for modest investment
>> 
>> Sounds charming~~   Let me have a quick try.  :-)
>> 
>> On Tue, Jul 7, 2015 at 4:18 AM, Julian Hyde <jh...@apache.org> wrote:
>> 
>>> Great news, and well done!
>>> 
>>> Getting to 1.0 was the hard part, because of the API changes, but I
>>> suggest you don’t stop there.
>>> 
>>> Calcite has continued to evolve over the last 6 months (see release notes
>>> [1]); 1.3 is the latest release and 1.4 is probably a couple of weeks away.
>>> There are a lot of improvements and bug-fixes in each release, and since
>>> these are minor releases post 1.0, we have made efforts to maintain
>>> compatibility.
>>> 
>>> Upgrading to 1.3 should give you a lot of benefit for modest investment,
>>> so I think you should do it as soon as possible. I’ll be glad to help.
>>> 
>>> Julian
>>> 
>>> [1] http://calcite.incubator.apache.org/docs/history.html
>>> 
>>> 
>>> 
>>> On Jul 5, 2015, at 11:16 PM, Li Yang <li...@apache.org> wrote:
>>> 
>>>> KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
>>>> Calcite 1.0.
>>>> 
>>>> This means Kylin's SQL capability has stepped up to a higher level. E.g.
>>>> the support of window function should be very close. (It maybe already
>>>> supported, I just haven't tested yet.)
>>>> 
>>>> 
>>>> Cheers
>>>> Yang
>>> 
>>> 
>>

Re: Kylin upgraded to Calcite 1.0

Posted by Li Yang <li...@apache.org>.

Yeah, upgraded to 1.3.0 on branches 0.7-staging and 0.8. The calcite-core API
change is minimal, while the Avatica API upgrade took me some time.

Kylin shall keep up with the latest Calcite release from now on.  :-)

On Tue, Jul 7, 2015 at 10:17 AM, Li Yang <li...@apache.org> wrote:

> > Upgrading to 1.3 should give you a lot of benefit for modest investment
>
> Sounds charming~~   Let me have a quick try.  :-)
>
> On Tue, Jul 7, 2015 at 4:18 AM, Julian Hyde <jh...@apache.org> wrote:
>
>> Great news, and well done!
>>
>> Getting to 1.0 was the hard part, because of the API changes, but I
>> suggest you don’t stop there.
>>
>> Calcite has continued to evolve over the last 6 months (see release notes
>> [1]); 1.3 is the latest release and 1.4 is probably a couple of weeks away.
>> There are a lot of improvements and bug-fixes in each release, and since
>> these are minor releases post 1.0, we have made efforts to maintain
>> compatibility.
>>
>> Upgrading to 1.3 should give you a lot of benefit for modest investment,
>> so I think you should do it as soon as possible. I’ll be glad to help.
>>
>> Julian
>>
>> [1] http://calcite.incubator.apache.org/docs/history.html
>>
>>
>>
>> On Jul 5, 2015, at 11:16 PM, Li Yang <li...@apache.org> wrote:
>>
>> > KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
>> > Calcite 1.0.
>> >
>> > This means Kylin's SQL capability has stepped up to a higher level. E.g.
>> > the support of window function should be very close. (It maybe already
>> > supported, I just haven't tested yet.)
>> >
>> >
>> > Cheers
>> > Yang
>>
>>
>

Re: Kylin upgraded to Calcite 1.0

Posted by Li Yang <li...@apache.org>.

> Upgrading to 1.3 should give you a lot of benefit for modest investment

Sounds charming~~   Let me have a quick try.  :-)

On Tue, Jul 7, 2015 at 4:18 AM, Julian Hyde <jh...@apache.org> wrote:

> Great news, and well done!
>
> Getting to 1.0 was the hard part, because of the API changes, but I
> suggest you don’t stop there.
>
> Calcite has continued to evolve over the last 6 months (see release notes
> [1]); 1.3 is the latest release and 1.4 is probably a couple of weeks away.
> There are a lot of improvements and bug-fixes in each release, and since
> these are minor releases post 1.0, we have made efforts to maintain
> compatibility.
>
> Upgrading to 1.3 should give you a lot of benefit for modest investment,
> so I think you should do it as soon as possible. I’ll be glad to help.
>
> Julian
>
> [1] http://calcite.incubator.apache.org/docs/history.html
>
>
>
> On Jul 5, 2015, at 11:16 PM, Li Yang <li...@apache.org> wrote:
>
> > KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
> > Calcite 1.0.
> >
> > This means Kylin's SQL capability has stepped up to a higher level. E.g.
> > the support of window function should be very close. (It maybe already
> > supported, I just haven't tested yet.)
> >
> >
> > Cheers
> > Yang
>
>

Re: Kylin upgraded to Calcite 1.0

Posted by Julian Hyde <jh...@apache.org>.

Great news, and well done!

Getting to 1.0 was the hard part, because of the API changes, but I suggest you don’t stop there.

Calcite has continued to evolve over the last 6 months (see release notes [1]); 1.3 is the latest release and 1.4 is probably a couple of weeks away. There are a lot of improvements and bug-fixes in each release, and since these are minor releases post 1.0, we have made efforts to maintain compatibility.

Upgrading to 1.3 should give you a lot of benefit for modest investment, so I think you should do it as soon as possible. I’ll be glad to help.

Julian

[1] http://calcite.incubator.apache.org/docs/history.html

On Jul 5, 2015, at 11:16 PM, Li Yang <li...@apache.org> wrote:

> KYLIN-780 is finally resolved. The latest branch of 0.7 & 0.8 now uses
> Calcite 1.0.
> 
> This means Kylin's SQL capability has stepped up to a higher level. E.g.
> the support of window function should be very close. (It maybe already
> supported, I just haven't tested yet.)
> 
> 
> Cheers
> Yang
