Posted to user@kylin.apache.org by Tony Lee <bt...@gmail.com> on 2016/09/20 12:50:28 UTC
Error while building cube from stream
Hi,
I was building a cube from a stream as the documentation
(http://kylin.apache.org/docs15/tutorial/cube_streaming.html) describes.
I was using 1.5.3 and encountered this error. The same error occurs on 1.5.4.
Everything works fine on 1.5.2.1.
Any idea how to solve this?
2016-09-20 20:31:51,520 INFO [main KafkaStreamingInput:129]: finish to get streaming batch, total message count:30
2016-09-20 20:31:51,532 DEBUG [main CubeManager:855]: Reloaded new cube: STREAMING_CUBE with reference beingCUBE[name=STREAMING_CUBE] having 1 segments:KYLIN_2822I1W3CX
2016-09-20 20:31:51,536 INFO [main CubeManager:314]: Updating cube instance 'STREAMING_CUBE'
2016-09-20 20:31:51,538 WARN [main StreamingCLI:127]: invalid args:streaming start STREAMING_CUBE 1474374540000_1474374600000 -start 1474374540000 -end 1474374600000 -cube STREAMING_CUBE
2016-09-20 20:31:51,539 ERROR [main StreamingCLI:103]: error start streaming
java.lang.IllegalStateException: Segments overlap: STREAMING_CUBE[FULL_BUILD] and STREAMING_CUBE[FULL_BUILD]
    at org.apache.kylin.cube.CubeValidator.validate(CubeValidator.java:85)
    at org.apache.kylin.cube.CubeManager.updateCubeWithRetry(CubeManager.java:358)
    at org.apache.kylin.cube.CubeManager.updateCube(CubeManager.java:301)
    at org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:441)
    at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.createBuildable(StreamingCubeBuilder.java:118)
    at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:76)
    at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
    at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
2016-09-20 20:31:51,543 INFO [Thread-0 ConnectionManager$HConnectionImplementation:1678]: Closing zookeeper sessionid=0x35708fbc2740013
2016-09-20 20:31:51,549 INFO [Thread-0 ZooKeeper:684]: Session: 0x35708fbc2740013 closed
2016-09-20 20:31:51,549 INFO [main-EventThread ClientCnxn:512]: EventThread shut down
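A reply further down this thread suggests the cube was not partitioned (no partition date column specified). Assuming that an unpartitioned cube gives every build the same full-range segment, which would match the FULL_BUILD name in the error, a second build would necessarily overlap the first. The sketch below only illustrates that reasoning. It is not the code in CubeValidator, and the names Segment, append, and canAppendAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SegmentOverlapSketch {

    /** Hypothetical stand-in for a cube segment covering the half-open range [start, end). */
    static final class Segment {
        final String name;
        final long start;
        final long end;

        Segment(String name, long start, long end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        /** Two half-open ranges overlap when each one starts before the other ends. */
        boolean overlaps(Segment other) {
            return start < other.end && other.start < end;
        }
    }

    /** Appends the candidate, or throws if it overlaps a segment already in the list. */
    static void append(List<Segment> segments, Segment candidate) {
        for (Segment existing : segments) {
            if (existing.overlaps(candidate)) {
                throw new IllegalStateException(
                        "Segments overlap: " + existing.name + " and " + candidate.name);
            }
        }
        segments.add(candidate);
    }

    /** Returns true if all candidates can be appended in order without an overlap. */
    static boolean canAppendAll(List<Segment> candidates) {
        List<Segment> accepted = new ArrayList<>();
        try {
            for (Segment candidate : candidates) {
                append(accepted, candidate);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Partitioned case: consecutive one-minute batches, like the -start/-end values in the log.
        boolean partitioned = canAppendAll(Arrays.asList(
                new Segment("batch-1", 1474374540000L, 1474374600000L),
                new Segment("batch-2", 1474374600000L, 1474374660000L)));
        System.out.println("partitioned batches accepted: " + partitioned);

        // Unpartitioned case (assumption): every build claims the whole time range.
        boolean unpartitioned = canAppendAll(Arrays.asList(
                new Segment("FULL_BUILD", 0L, Long.MAX_VALUE),
                new Segment("FULL_BUILD", 0L, Long.MAX_VALUE)));
        System.out.println("full-range builds accepted: " + unpartitioned);
    }
}
```

Under these assumptions the first pair is accepted because adjacent half-open ranges do not overlap, while the second full-range segment is rejected.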
Re: Error while building cube from stream
Posted by Tony Lee <bt...@gmail.com>.
Thanks for your reply.
I have created an issue here:
https://issues.apache.org/jira/browse/KYLIN-2053
On Mon, Sep 26, 2016 at 4:59 PM, ShaoFeng Shi <sh...@apache.org>
wrote:
> Hi Tony,
>
> You're correct; the global dictionary isn't supported in the stream builder
> (this is the first report of it). Could you please open a JIRA?
> https://issues.apache.org/jira/secure/Dashboard.jspa
>
> BTW, we're developing a new version of the streaming engine, which will
> reuse most of the logic of the batch cubing engine and is planned to roll
> out in v1.6. I believe there will be no such issue with the new design.
Re: Error while building cube from stream
Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,
You're correct; the global dictionary isn't supported in the stream builder
(this is the first report of it). Could you please open a JIRA?
https://issues.apache.org/jira/secure/Dashboard.jspa
BTW, we're developing a new version of the streaming engine, which will reuse
most of the logic of the batch cubing engine and is planned to roll out in
v1.6. I believe there will be no such issue with the new design.
2016-09-26 14:56 GMT+08:00 Tony Lee <bt...@gmail.com>:
> Thanks
>
> But this does not work on a streaming cube.
>
> I read some code and found that in class *StreamingCubeBuilder*, the
> dictionary map is built by *DictionaryGenerator.buildDictionary()*
> instead of *DictionaryManager.buildDictionary()*. Does this mean that
> streaming cubes do not support the global dictionary?
>
> I added USERID to the dimensions, and then the cube was built successfully.
> But I think the result will be incorrect if I calculate count distinct
> across different segments. Is that right?
>
>
> Tony
--
Best regards,
Shaofeng Shi 史少锋
Re: Error while building cube from stream
Posted by Tony Lee <bt...@gmail.com>.
Thanks
But this does not work on a streaming cube.
I read some code and found that in class *StreamingCubeBuilder*, the
dictionary map is built by *DictionaryGenerator.buildDictionary()* instead
of *DictionaryManager.buildDictionary()*. Does this mean that streaming
cubes do not support the global dictionary?
I added USERID to the dimensions, and then the cube was built successfully.
But I think the result will be incorrect if I calculate count distinct
across different segments. Is that right?
Tony
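The concern about counting across segments can be illustrated with a small sketch. It assumes that a bitmap count-distinct measure stores dictionary-encoded integer ids and that segments are combined by OR-ing their bitmaps. Under that assumption, two segments that each build their own dictionary can assign the same integer to different users, so the merged count comes out too low. All class and method names here are hypothetical and are not Kylin APIs.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DistinctCountSketch {

    /**
     * Encodes each user id with the given dictionary, assigning the next free
     * integer to ids not seen before, and returns the bitmap of encoded ids.
     */
    static BitSet encode(List<String> userIds, Map<String, Integer> dictionary) {
        BitSet bitmap = new BitSet();
        for (String id : userIds) {
            Integer code = dictionary.get(id);
            if (code == null) {
                code = dictionary.size();
                dictionary.put(id, code);
            }
            bitmap.set(code);
        }
        return bitmap;
    }

    /** Merges two segment bitmaps with OR and counts the set bits. */
    static int mergedCount(BitSet first, BitSet second) {
        BitSet merged = (BitSet) first.clone();
        merged.or(second);
        return merged.cardinality();
    }

    /** Each segment builds its own dictionary, so codes can collide across segments. */
    static int countWithPerSegmentDictionaries(List<String> seg1, List<String> seg2) {
        return mergedCount(encode(seg1, new HashMap<>()), encode(seg2, new HashMap<>()));
    }

    /** Both segments share one dictionary, so every user keeps a single code. */
    static int countWithSharedDictionary(List<String> seg1, List<String> seg2) {
        Map<String, Integer> shared = new HashMap<>();
        return mergedCount(encode(seg1, shared), encode(seg2, shared));
    }

    public static void main(String[] args) {
        List<String> seg1 = Arrays.asList("alice", "bob");
        List<String> seg2 = Arrays.asList("carol", "dave", "alice");

        // Four distinct users exist across the two segments.
        System.out.println(countWithPerSegmentDictionaries(seg1, seg2)); // prints 3
        System.out.println(countWithSharedDictionary(seg1, seg2));       // prints 4
    }
}
```

In this toy data the per-segment dictionaries map both "alice" and "carol" to 0, so the merged bitmap undercounts; the shared dictionary keeps the codes consistent.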
On Sat, Sep 24, 2016 at 10:29 PM, ShaoFeng Shi <sh...@apache.org>
wrote:
> Hi Tony,
>
> The error occurred when building a bitmap counter (for distinct
> count); from your cube descriptor, it seems no global dictionary
> is specified for the user id column. Please check this blog:
> https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
>
> 2016-09-22 10:49 GMT+08:00 Tony Lee <bt...@gmail.com>:
>
>> Thanks, ShaoFeng Shi. That is the reason.
>>
>> But unfortunately, I have a new problem with count distinct (precise).
>>
>> I added a streaming table on version 1.5.4 with my own JSON, which looks
>> like this:
>> {
>>   "logTimestamp": 1474456891127,
>>   "datetime": "2016-09-21 19:21:31",
>>   "uploadTime": "20160921192023",
>>   "userId": "f2d28cbf9e21340a49e97063486db1f5",
>>   "accountId": "84108490",
>>   "otherfield": "...."
>> }
>>
>> *The error message while building the cube is*
>>
>> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
>> java.lang.RuntimeException: error build cube from StreamingBatch
>>     at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
>>     at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>> Caused by: java.lang.NullPointerException
>>     at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
>>     at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
>>     at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
>>     at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>>
>> *and the cube json is*
>> {
>>   "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
>>   "last_modified": 1474511879506,
>>   "version": "1.5.4",
>>   "name": "hot_play_c",
>>   "model_name": "hot_play_cube",
>>   "description": "",
>>   "null_string": null,
>>   "dimensions": [
>>     {
>>       "name": "DEFAULT.HOT_PLAY.HOUR_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "HOUR_START",
>>       "derived": null
>>     },
>>     {
>>       "name": "DEFAULT.HOT_PLAY.MINUTE_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "MINUTE_START",
>>       "derived": null
>>     }
>>   ],
>>   "measures": [
>>     {
>>       "name": "_COUNT_",
>>       "function": {
>>         "expression": "COUNT",
>>         "parameter": {
>>           "type": "constant",
>>           "value": "1",
>>           "next_parameter": null
>>         },
>>         "returntype": "bigint"
>>       },
>>       "dependent_measure_ref": null
>>     },
>>     {
>>       "name": "COUNT_DISTINCT_USER",
>>       "function": {
>>         "expression": "COUNT_DISTINCT",
>>         "parameter": {
>>           "type": "column",
>>           "value": "USERID",
>>           "next_parameter": null
>>         },
>>         "returntype": "bitmap"
>>       },
>>       "dependent_measure_ref": null
>>     }
>>   ],
>>   "dictionaries": [],
>>   "rowkey": {
>>     "rowkey_columns": [
>>       {
>>         "column": "HOUR_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       },
>>       {
>>         "column": "MINUTE_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       }
>>     ]
>>   },
>>   "hbase_mapping": {
>>     "column_family": [
>>       {
>>         "name": "F1",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "_COUNT_"
>>             ]
>>           }
>>         ]
>>       },
>>       {
>>         "name": "F2",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "COUNT_DISTINCT_USER"
>>             ]
>>           }
>>         ]
>>       }
>>     ]
>>   },
>>   "aggregation_groups": [
>>     {
>>       "includes": [
>>         "HOUR_START",
>>         "MINUTE_START"
>>       ],
>>       "select_rule": {
>>         "hierarchy_dims": [],
>>         "mandatory_dims": [],
>>         "joint_dims": []
>>       }
>>     }
>>   ],
>>   "signature": "QXddyWCVVCYQcozxd4Zh2w==",
>>   "notify_list": [],
>>   "status_need_notify": [
>>     "ERROR",
>>     "DISCARDED",
>>     "SUCCEED"
>>   ],
>>   "partition_date_start": 0,
>>   "partition_date_end": 3153600000000,
>>   "auto_merge_time_ranges": [
>>     604800000,
>>     2419200000
>>   ],
>>   "retention_range": 0,
>>   "engine_type": 2,
>>   "storage_type": 2,
>>   "override_kylin_properties": {}
>> }
>>
>> *No error occurs after I change the returntype to hllc(16).*
>>
>> *I have struggled with this for several days. Any hints?*
>>
>> On Wed, Sep 21, 2016 at 10:47 PM, ShaoFeng Shi <sh...@apache.org>
>> wrote:
>>
>>> Hi Tony,
>>>
>>> It seems your cube isn't partitioned (no partition date column
>>> specified); please check or provide the cube JSON.
>>>
>>> 2016-09-21 0:30 GMT+08:00 Alberto Ramón <a....@gmail.com>:
>>>
>>>> I don't know, but can you check this change? KYLIN-1744
>>>> <https://issues.apache.org/jira/browse/KYLIN-1744> in V1.3
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
Re: Error while building cube from stream
Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,
The error occurred when building a bitmap counter (for distinct count);
from your cube descriptor, it seems no global dictionary is
specified for the user id column. Please check this blog:
https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
2016-09-22 10:49 GMT+08:00 Tony Lee <bt...@gmail.com>:
> Thanks, ShaoFeng Shi. That is the reason.
>
> But unfortunately, I have a new problem with count distinct (precise).
>
> I added a streaming table on version 1.5.4 with my own JSON, which looks
> like this:
> {
>   "logTimestamp": 1474456891127,
>   "datetime": "2016-09-21 19:21:31",
>   "uploadTime": "20160921192023",
>   "userId": "f2d28cbf9e21340a49e97063486db1f5",
>   "accountId": "84108490",
>   "otherfield": "...."
> }
>
> *The error message while building the cube is*
>
> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start
> streaming
> java.lang.RuntimeException: error build cube from StreamingBatch
> at org.apache.kylin.engine.streaming.cube.
> StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
> at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(
> OneOffStreamingBuilder.java:79)
> at org.apache.kylin.engine.streaming.cli.StreamingCLI.
> startOneOffCubeStreaming(StreamingCLI.java:123)
> at org.apache.kylin.engine.streaming.cli.StreamingCLI.
> main(StreamingCLI.java:97)
> Caused by: java.lang.NullPointerException
> at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.
> valueOf(BitmapMeasureType.java:100)
> at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.
> valueOf(BitmapMeasureType.java:89)
> at org.apache.kylin.cube.inmemcubing.
> InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter
> .java:122)
> at org.apache.kylin.cube.inmemcubing.
> InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter
> .java:94)
> at org.apache.kylin.cube.inmemcubing.
> InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter
> .java:70)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$
> InputConverter$1.next(InMemCubeBuilder.java:542)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$
> InputConverter$1.next(InMemCubeBuilder.java:523)
> at org.apache.kylin.gridtable.GTAggregateScanner.iterator(
> GTAggregateScanner.java:139)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.
> createBaseCuboid(InMemCubeBuilder.java:339)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.
> build(InMemCubeBuilder.java:166)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.
> build(InMemCubeBuilder.java:135)
> at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.
> build(InMemCubeBuilder.java:122)
> at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.
> run(AbstractInMemCubeBuilder.java:80)
> at java.util.concurrent.Executors$RunnableAdapter.
> call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
> *and the cube json is*
> {
> "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
> "last_modified": 1474511879506,
> "version": "1.5.4",
> "name": "hot_play_c",
> "model_name": "hot_play_cube",
> "description": "",
> "null_string": null,
> "dimensions": [
> {
> "name": "DEFAULT.HOT_PLAY.HOUR_START",
> "table": "DEFAULT.HOT_PLAY",
> "column": "HOUR_START",
> "derived": null
> },
> {
> "name": "DEFAULT.HOT_PLAY.MINUTE_START",
> "table": "DEFAULT.HOT_PLAY",
> "column": "MINUTE_START",
> "derived": null
> }
> ],
> "measures": [
> {
> "name": "_COUNT_",
> "function": {
> "expression": "COUNT",
> "parameter": {
> "type": "constant",
> "value": "1",
> "next_parameter": null
> },
> "returntype": "bigint"
> },
> "dependent_measure_ref": null
> },
> {
> "name": "COUNT_DISTINCT_USER",
> "function": {
> "expression": "COUNT_DISTINCT",
> "parameter": {
> "type": "column",
> "value": "USERID",
> "next_parameter": null
> },
> "returntype": "bitmap"
> },
> "dependent_measure_ref": null
> }
> ],
> "dictionaries": [],
> "rowkey": {
> "rowkey_columns": [
> {
> "column": "HOUR_START",
> "encoding": "time",
> "isShardBy": false
> },
> {
> "column": "MINUTE_START",
> "encoding": "time",
> "isShardBy": false
> }
> ]
> },
> "hbase_mapping": {
> "column_family": [
> {
> "name": "F1",
> "columns": [
> {
> "qualifier": "M",
> "measure_refs": [
> "_COUNT_"
> ]
> }
> ]
> },
> {
> "name": "F2",
> "columns": [
> {
> "qualifier": "M",
> "measure_refs": [
> "COUNT_DISTINCT_USER"
> ]
> }
> ]
> }
> ]
> },
> "aggregation_groups": [
> {
> "includes": [
> "HOUR_START",
> "MINUTE_START"
> ],
> "select_rule": {
> "hierarchy_dims": [],
> "mandatory_dims": [],
> "joint_dims": []
> }
> }
> ],
> "signature": "QXddyWCVVCYQcozxd4Zh2w==",
> "notify_list": [],
> "status_need_notify": [
> "ERROR",
> "DISCARDED",
> "SUCCEED"
> ],
> "partition_date_start": 0,
> "partition_date_end": 3153600000000,
> "auto_merge_time_ranges": [
> 604800000,
> 2419200000
> ],
> "retention_range": 0,
> "engine_type": 2,
> "storage_type": 2,
> "override_kylin_properties": {}
> }
>
> *no error after i change the returntype to hllc(16)*
>
> *i have struggled for several days. Any hints about this?*
>
> On Wed, Sep 21, 2016 at 10:47 PM, ShaoFeng Shi <sh...@apache.org>
> wrote:
>
>> Hi Tony,
>>
>> It seems your cube isn't partitioned (no partition date column
>> specified); please check or provide the cube JSON.
>>
>> 2016-09-21 0:30 GMT+08:00 Alberto Ramón <a....@gmail.com>:
>>
>>> I don't know but , can you check this change?: KYLIN-1744
>>> <https://issues.apache.org/jira/browse/KYLIN-1744> in V1.3
>>>
>>>
>>> 2016-09-20 14:50 GMT+02:00 Tony Lee <bt...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> I was building cube from stream as the document(http://kylin.apache.o
>>>> rg/docs15/tutorial/cube_streaming.html
>>>>
>>>> ) says.
>>>>
>>>> I was using 1.5.3, and i encounter this error. Same error on 1.5.4.
>>>> Everything fine on 1.5.2.1.
>>>>
>>>> Any idea how to solve this?
>>>>
>>>>
>>>> 2016-09-20 20:31:51,520 INFO [main KafkaStreamingInput:129]: finish to
>>>> get streaming batch, total message count:30
>>>> 2016-09-20 20:31:51,532 DEBUG [main CubeManager:855]: Reloaded new
>>>> cube: STREAMING_CUBE with reference beingCUBE[name=STREAMING_CUBE] having 1
>>>> segments:KYLIN_2822I1W3CX
>>>> 2016-09-20 20:31:51,536 INFO [main CubeManager:314]: Updating cube
>>>> instance 'STREAMING_CUBE'
>>>> 2016-09-20 20:31:51,538 WARN [main StreamingCLI:127]: invalid
>>>> args:streaming start STREAMING_CUBE 1474374540000_1474374600000 -start
>>>> 1474374540000 -end 1474374600000 -cube STREAMING_CUBE
>>>> 2016-09-20 20:31:51,539 ERROR [main StreamingCLI:103]: error start
>>>> streaming
>>>> java.lang.IllegalStateException: Segments overlap:
>>>> STREAMING_CUBE[FULL_BUILD] and STREAMING_CUBE[FULL_BUILD]
>>>> at org.apache.kylin.cube.CubeValidator.validate(CubeValidator.java:85)
>>>> at org.apache.kylin.cube.CubeManager.updateCubeWithRetry(CubeMa
>>>> nager.java:358)
>>>> at org.apache.kylin.cube.CubeManager.updateCube(CubeManager.java:301)
>>>> at org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.
>>>> java:441)
>>>> at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.
>>>> createBuildable(StreamingCubeBuilder.java:118)
>>>> at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.r
>>>> un(OneOffStreamingBuilder.java:76)
>>>> at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneO
>>>> ffCubeStreaming(StreamingCLI.java:123)
>>>> at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(Stre
>>>> amingCLI.java:97)
>>>> 2016-09-20 20:31:51,543 INFO [Thread-0 ConnectionManager$HConnectionImplementation:1678]:
>>>> Closing zookeeper sessionid=0x35708fbc2740013
>>>> 2016-09-20 20:31:51,549 INFO [Thread-0 ZooKeeper:684]: Session:
>>>> 0x35708fbc2740013 closed
>>>> 2016-09-20 20:31:51,549 INFO [main-EventThread ClientCnxn:512]:
>>>> EventThread shut down
>>>>
>>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Error while building cube from stream
Posted by Tony Lee <bt...@gmail.com>.
Thanks, ShaoFeng Shi. That was the reason.
Unfortunately, I now have a new problem with precise count distinct.
I added a streaming table on version 1.5.4 using my own JSON, which looks
like this:
{
  "logTimestamp":1474456891127,
  "datetime":"2016-09-21 19:21:31",
  "uploadTime":"20160921192023",
  "userId":"f2d28cbf9e21340a49e97063486db1f5",
  "accountId":"84108490",
  "otherfield":"...."
}
*The error message while building the cube is*
2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
java.lang.RuntimeException: error build cube from StreamingBatch
at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
Caused by: java.lang.NullPointerException
at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
*and the cube json is*
{
  "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
  "last_modified": 1474511879506,
  "version": "1.5.4",
  "name": "hot_play_c",
  "model_name": "hot_play_cube",
  "description": "",
  "null_string": null,
  "dimensions": [
    {
      "name": "DEFAULT.HOT_PLAY.HOUR_START",
      "table": "DEFAULT.HOT_PLAY",
      "column": "HOUR_START",
      "derived": null
    },
    {
      "name": "DEFAULT.HOT_PLAY.MINUTE_START",
      "table": "DEFAULT.HOT_PLAY",
      "column": "MINUTE_START",
      "derived": null
    }
  ],
  "measures": [
    {
      "name": "_COUNT_",
      "function": {
        "expression": "COUNT",
        "parameter": {
          "type": "constant",
          "value": "1",
          "next_parameter": null
        },
        "returntype": "bigint"
      },
      "dependent_measure_ref": null
    },
    {
      "name": "COUNT_DISTINCT_USER",
      "function": {
        "expression": "COUNT_DISTINCT",
        "parameter": {
          "type": "column",
          "value": "USERID",
          "next_parameter": null
        },
        "returntype": "bitmap"
      },
      "dependent_measure_ref": null
    }
  ],
  "dictionaries": [],
  "rowkey": {
    "rowkey_columns": [
      {
        "column": "HOUR_START",
        "encoding": "time",
        "isShardBy": false
      },
      {
        "column": "MINUTE_START",
        "encoding": "time",
        "isShardBy": false
      }
    ]
  },
  "hbase_mapping": {
    "column_family": [
      {
        "name": "F1",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "_COUNT_"
            ]
          }
        ]
      },
      {
        "name": "F2",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "COUNT_DISTINCT_USER"
            ]
          }
        ]
      }
    ]
  },
  "aggregation_groups": [
    {
      "includes": [
        "HOUR_START",
        "MINUTE_START"
      ],
      "select_rule": {
        "hierarchy_dims": [],
        "mandatory_dims": [],
        "joint_dims": []
      }
    }
  ],
  "signature": "QXddyWCVVCYQcozxd4Zh2w==",
  "notify_list": [],
  "status_need_notify": [
    "ERROR",
    "DISCARDED",
    "SUCCEED"
  ],
  "partition_date_start": 0,
  "partition_date_end": 3153600000000,
  "auto_merge_time_ranges": [
    604800000,
    2419200000
  ],
  "retention_range": 0,
  "engine_type": 2,
  "storage_type": 2,
  "override_kylin_properties": {}
}
*There is no error after I change the returntype to hllc(16).*
*I have struggled with this for several days. Any hints?*
Re: Error while building cube from stream
Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,
It seems your cube isn't partitioned (no partition date column specified);
please check or provide the cube JSON.
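To illustrate why an unpartitioned cube could produce "Segments overlap", here is a rough sketch of a range-overlap check. It is not Kylin's actual CubeValidator logic. It assumes segments are half-open time ranges, and that a full build without a partition date column covers one unbounded range (the FULL_RANGE value is a made-up stand-in). The one-minute window is the one from the log above; the second window is a hypothetical follow-up batch.

```python
def overlaps(a, b):
    """Half-open ranges [start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


# Assumed stand-in for the range of an unpartitioned (FULL_BUILD) segment.
FULL_RANGE = (0, 2**63 - 1)

# Without a partition column, a second streaming batch would cover the same
# full range as the existing segment, so the two collide.
print(overlaps(FULL_RANGE, FULL_RANGE))

# With a partition column, consecutive one-minute windows only touch at the
# boundary and do not overlap.
first = (1474374540000, 1474374600000)   # window from the log above
second = (1474374600000, 1474374660000)  # hypothetical next window
print(overlaps(first, second))
```

Under these assumptions the first check is true and the second is false, which is consistent with the error naming two FULL_BUILD segments.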
--
Best regards,
Shaofeng Shi 史少锋
Re: Error while building cube from stream
Posted by Alberto Ramón <a....@gmail.com>.
I'm not sure, but can you check this change? KYLIN-1744
<https://issues.apache.org/jira/browse/KYLIN-1744> in v1.3