Posted to user@kylin.apache.org by Tony Lee <bt...@gmail.com> on 2016/09/20 12:50:28 UTC

Error while building cube from stream

Hi,

I was building a cube from a stream as the tutorial
(http://kylin.apache.org/docs15/tutorial/cube_streaming.html) describes.

I was using 1.5.3 and encountered this error. The same error occurs on 1.5.4.
Everything works fine on 1.5.2.1.

Any idea how to solve this?


2016-09-20 20:31:51,520 INFO  [main KafkaStreamingInput:129]: finish to get streaming batch, total message count:30
2016-09-20 20:31:51,532 DEBUG [main CubeManager:855]: Reloaded new cube: STREAMING_CUBE with reference beingCUBE[name=STREAMING_CUBE] having 1 segments:KYLIN_2822I1W3CX
2016-09-20 20:31:51,536 INFO  [main CubeManager:314]: Updating cube instance 'STREAMING_CUBE'
2016-09-20 20:31:51,538 WARN  [main StreamingCLI:127]: invalid args:streaming start STREAMING_CUBE 1474374540000_1474374600000 -start 1474374540000 -end 1474374600000 -cube STREAMING_CUBE
2016-09-20 20:31:51,539 ERROR [main StreamingCLI:103]: error start streaming
java.lang.IllegalStateException: Segments overlap: STREAMING_CUBE[FULL_BUILD] and STREAMING_CUBE[FULL_BUILD]
        at org.apache.kylin.cube.CubeValidator.validate(CubeValidator.java:85)
        at org.apache.kylin.cube.CubeManager.updateCubeWithRetry(CubeManager.java:358)
        at org.apache.kylin.cube.CubeManager.updateCube(CubeManager.java:301)
        at org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:441)
        at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.createBuildable(StreamingCubeBuilder.java:118)
        at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:76)
        at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
        at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
2016-09-20 20:31:51,543 INFO  [Thread-0 ConnectionManager$HConnectionImplementation:1678]: Closing zookeeper sessionid=0x35708fbc2740013
2016-09-20 20:31:51,549 INFO  [Thread-0 ZooKeeper:684]: Session: 0x35708fbc2740013 closed
2016-09-20 20:31:51,549 INFO  [main-EventThread ClientCnxn:512]: EventThread shut down
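
A reply quoted further down this thread attributes the error to the cube not being partitioned (no partition date column). The sketch below illustrates why that could produce an overlap. It is not Kylin's CubeValidator code: `Segment`, `overlaps`, and `to_utc` are hypothetical names, and the idea that an unpartitioned segment covers the whole time line is an assumption suggested by both segments being named FULL_BUILD in the error. The batch range is taken from the -start/-end arguments in the log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a cube segment (not Kylin's CubeSegment)."""
    name: str
    start_ms: int  # inclusive, epoch milliseconds
    end_ms: int    # exclusive, epoch milliseconds


def overlaps(a: Segment, b: Segment) -> bool:
    """Two half-open ranges [start, end) overlap when each starts before the other ends."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def to_utc(ms: int) -> str:
    """Render epoch milliseconds as a UTC timestamp string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Range from the "-start 1474374540000 -end 1474374600000" arguments in the log.
batch = Segment("batch", 1474374540000, 1474374600000)
print(to_utc(batch.start_ms), "->", to_utc(batch.end_ms))

# Assumption: without a partition column every segment spans the full range,
# so a second segment always collides with the first one.
FULL_RANGE_END = 2**63 - 1
existing = Segment("FULL_BUILD", 0, FULL_RANGE_END)
incoming = Segment("FULL_BUILD", 0, FULL_RANGE_END)
print("unpartitioned segments overlap:", overlaps(existing, incoming))

# With real partition ranges, adjacent one-minute segments do not overlap.
previous = Segment("previous", 1474374480000, 1474374540000)
print("adjacent partitioned segments overlap:", overlaps(previous, batch))
```

The batch range decodes to a one-minute window (12:29:00 to 12:30:00 UTC on 2016-09-20), which matches the log timestamps in a UTC+8 time zone.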

Re: Error while building cube from stream

Posted by Tony Lee <bt...@gmail.com>.
Thanks for your reply.

I have created an issue here:
https://issues.apache.org/jira/browse/KYLIN-2053


On Mon, Sep 26, 2016 at 4:59 PM, ShaoFeng Shi <sh...@apache.org>
wrote:

> Hi Tony,
>
> You're correct: the global dictionary is not supported in the streaming
> builder (this is the first report of it). Could you please open a JIRA?
> https://issues.apache.org/jira/secure/Dashboard.jspa
>
> BTW, we're developing a new version of the streaming engine, which will
> reuse most of the logic of the batch cubing engine and is planned to roll
> out in v1.6. I believe the new design will not have this issue.
>
> 2016-09-26 14:56 GMT+08:00 Tony Lee <bt...@gmail.com>:
>
>> Thanks
>>
>> But this does not work on a streaming cube.
>>
>> I read some of the code and found that in class *StreamingCubeBuilder*, the
>> dictionary map is built by *DictionaryGenerator.buildDictionary()*
>> instead of *DictionaryManager.buildDictionary()*. Does this mean that
>> streaming cubes do not support the global dictionary?
>>
>> After I added USERID to the dimensions, the cube built successfully. But
>> I think the result will be incorrect if I calculate count distinct across
>> different segments. Is that right?
>>
>>
>> Tony
>>
>> On Sat, Sep 24, 2016 at 10:29 PM, ShaoFeng Shi <sh...@apache.org>
>> wrote:
>>
>>> Hi Tony,
>>>
>>> The error occurred when building a bitmap counter (for distinct
>>> count); from your cube descriptor, it seems no global dictionary
>>> is specified for the user id column. Please check this blog:
>>> https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
>>>
>>> 2016-09-22 10:49 GMT+08:00 Tony Lee <bt...@gmail.com>:
>>>
>>>> Thanks, ShaoFeng Shi. That is the reason.
>>>>
>>>> But unfortunately, I have a new problem with precise count distinct.
>>>>
>>>> I added a streaming table on version 1.5.4 with my own JSON, which looks
>>>> like this:
>>>> {
>>>>     "logTimestamp":1474456891127,
>>>>     "datetime":"2016-09-21 19:21:31",
>>>>     "uploadTime":"20160921192023",
>>>>     "userId":"f2d28cbf9e21340a49e97063486db1f5",
>>>>     "accountId":"84108490",
>>>>     "otherfield":"...."
>>>> }
>>>>
>>>> *The error message while building the cube is*
>>>>
>>>> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
>>>> java.lang.RuntimeException: error build cube from StreamingBatch
>>>>         at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
>>>>         at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
>>>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>>>> Caused by: java.lang.NullPointerException
>>>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
>>>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
>>>>         at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
>>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
>>>>         at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>
>>>>
>>>> *and the cube json is*
>>>> {
>>>>   "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
>>>>   "last_modified": 1474511879506,
>>>>   "version": "1.5.4",
>>>>   "name": "hot_play_c",
>>>>   "model_name": "hot_play_cube",
>>>>   "description": "",
>>>>   "null_string": null,
>>>>   "dimensions": [
>>>>     {
>>>>       "name": "DEFAULT.HOT_PLAY.HOUR_START",
>>>>       "table": "DEFAULT.HOT_PLAY",
>>>>       "column": "HOUR_START",
>>>>       "derived": null
>>>>     },
>>>>     {
>>>>       "name": "DEFAULT.HOT_PLAY.MINUTE_START",
>>>>       "table": "DEFAULT.HOT_PLAY",
>>>>       "column": "MINUTE_START",
>>>>       "derived": null
>>>>     }
>>>>   ],
>>>>   "measures": [
>>>>     {
>>>>       "name": "_COUNT_",
>>>>       "function": {
>>>>         "expression": "COUNT",
>>>>         "parameter": {
>>>>           "type": "constant",
>>>>           "value": "1",
>>>>           "next_parameter": null
>>>>         },
>>>>         "returntype": "bigint"
>>>>       },
>>>>       "dependent_measure_ref": null
>>>>     },
>>>>     {
>>>>       "name": "COUNT_DISTINCT_USER",
>>>>       "function": {
>>>>         "expression": "COUNT_DISTINCT",
>>>>         "parameter": {
>>>>           "type": "column",
>>>>           "value": "USERID",
>>>>           "next_parameter": null
>>>>         },
>>>>         "returntype": "bitmap"
>>>>       },
>>>>       "dependent_measure_ref": null
>>>>     }
>>>>   ],
>>>>   "dictionaries": [],
>>>>   "rowkey": {
>>>>     "rowkey_columns": [
>>>>       {
>>>>         "column": "HOUR_START",
>>>>         "encoding": "time",
>>>>         "isShardBy": false
>>>>       },
>>>>       {
>>>>         "column": "MINUTE_START",
>>>>         "encoding": "time",
>>>>         "isShardBy": false
>>>>       }
>>>>     ]
>>>>   },
>>>>   "hbase_mapping": {
>>>>     "column_family": [
>>>>       {
>>>>         "name": "F1",
>>>>         "columns": [
>>>>           {
>>>>             "qualifier": "M",
>>>>             "measure_refs": [
>>>>               "_COUNT_"
>>>>             ]
>>>>           }
>>>>         ]
>>>>       },
>>>>       {
>>>>         "name": "F2",
>>>>         "columns": [
>>>>           {
>>>>             "qualifier": "M",
>>>>             "measure_refs": [
>>>>               "COUNT_DISTINCT_USER"
>>>>             ]
>>>>           }
>>>>         ]
>>>>       }
>>>>     ]
>>>>   },
>>>>   "aggregation_groups": [
>>>>     {
>>>>       "includes": [
>>>>         "HOUR_START",
>>>>         "MINUTE_START"
>>>>       ],
>>>>       "select_rule": {
>>>>         "hierarchy_dims": [],
>>>>         "mandatory_dims": [],
>>>>         "joint_dims": []
>>>>       }
>>>>     }
>>>>   ],
>>>>   "signature": "QXddyWCVVCYQcozxd4Zh2w==",
>>>>   "notify_list": [],
>>>>   "status_need_notify": [
>>>>     "ERROR",
>>>>     "DISCARDED",
>>>>     "SUCCEED"
>>>>   ],
>>>>   "partition_date_start": 0,
>>>>   "partition_date_end": 3153600000000,
>>>>   "auto_merge_time_ranges": [
>>>>     604800000,
>>>>     2419200000
>>>>   ],
>>>>   "retention_range": 0,
>>>>   "engine_type": 2,
>>>>   "storage_type": 2,
>>>>   "override_kylin_properties": {}
>>>> }
>>>>
>>>> *There is no error after I change the returntype to hllc(16).*
>>>>
>>>> *I have struggled for several days. Any hints about this?*
>>>>
>>>> On Wed, Sep 21, 2016 at 10:47 PM, ShaoFeng Shi <sh...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Tony,
>>>>>
>>>>> It seems your cube isn't partitioned (no partition date column
>>>>> specified); please check or provide the cube JSON.
>>>>>
>>>>> 2016-09-21 0:30 GMT+08:00 Alberto Ramón <a....@gmail.com>:
>>>>>
>>>>>> I don't know, but can you check this change: KYLIN-1744
>>>>>> <https://issues.apache.org/jira/browse/KYLIN-1744> in v1.3?
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> Shaofeng Shi 史少锋
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>

Re: Error while building cube from stream

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,

You're correct: the global dictionary is not supported in the streaming builder
(this is the first report of it). Could you please open a JIRA?
https://issues.apache.org/jira/secure/Dashboard.jspa

BTW, we're developing a new version of the streaming engine, which will reuse
most of the logic of the batch cubing engine and is planned to roll out in v1.6.
I believe the new design will not have this issue.

-- 
Best regards,

Shaofeng Shi 史少锋

Re: Error while building cube from stream

Posted by Tony Lee <bt...@gmail.com>.
Thanks

But this does not work on a streaming cube.

I read some of the code and found that in class *StreamingCubeBuilder*, the
dictionary map is built by *DictionaryGenerator.buildDictionary()* instead
of *DictionaryManager.buildDictionary()*. Does this mean that streaming
cubes do not support the global dictionary?

After I added USERID to the dimensions, the cube built successfully. But I
think the result will be incorrect if I calculate count distinct across
different segments. Is that right?
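
The concern about merging results across segments can be illustrated with a small sketch. It is not Kylin code: `build_dictionary` and `encode` are hypothetical helpers, the user names are made up, and Python sets stand in for bitmaps. It only shows the general reason a precise, bitmap-based count distinct needs one dictionary shared by all segments, assuming each segment otherwise assigns its own integer IDs.

```python
def build_dictionary(values, existing=None):
    """Assign dense integer IDs to values, keeping any IDs already in `existing`."""
    mapping = dict(existing or {})
    for value in sorted(set(values)):
        if value not in mapping:
            mapping[value] = len(mapping)  # next unused dense ID
    return mapping


def encode(values, dictionary):
    """Encode values as a set of integer IDs (a stand-in for a bitmap)."""
    return {dictionary[v] for v in values}


# Made-up user IDs seen in two segments; "bob" appears in both.
seg1_users = ["alice", "bob"]
seg2_users = ["bob", "carol"]
true_distinct = len(set(seg1_users) | set(seg2_users))

# Per-segment dictionaries: each segment numbers its users from 0,
# so the same ID can mean different users in different segments.
d1 = build_dictionary(seg1_users)
d2 = build_dictionary(seg2_users)
local_merged = encode(seg1_users, d1) | encode(seg2_users, d2)

# One shared ("global") dictionary: an ID always means the same user,
# so the union of the per-segment bitmaps is the true distinct set.
g = build_dictionary(seg1_users)
g = build_dictionary(seg2_users, existing=g)
global_merged = encode(seg1_users, g) | encode(seg2_users, g)

print("true distinct users:      ", true_distinct)
print("per-segment dictionaries: ", len(local_merged))
print("shared dictionary:        ", len(global_merged))
```

With per-segment dictionaries, "alice" in the first segment and "bob" in the second both receive ID 0, so the merged bitmap undercounts. With the shared dictionary the merged count matches the true number of distinct users.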


Tony

On Sat, Sep 24, 2016 at 10:29 PM, ShaoFeng Shi <sh...@apache.org>
wrote:

> Hi Tony,
>
> The error was occurred when building a bitmap counter (for distinct
> count); from your cube descriptor, it seems there is no global dictionary
> be specified for the user id column. Please check this blog:
> https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
>
> 2016-09-22 10:49 GMT+08:00 Tony Lee <bt...@gmail.com>:
>
>> Thanks, ShaoFeng Shi. That is the reason.
>>
>> But unfortunately, I have a new problem about count distinct (precisely)
>>
>> I  added a streaming table on version 1.5.4 with my own json, which is
>> like this
>> {
>>     "logTimestamp":1474456891127,
>>     "datetime":"2016-09-21 19:21:31",
>>     "uploadTime":"20160921192023",
>>     "userId":"f2d28cbf9e21340a49e97063486db1f5",
>>     "accountId":"84108490",
>>     "otherfield":"...."
>> }
>>
>> *The error message while building the cube is*
>>
>> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start
>> streaming
>> java.lang.RuntimeException: error build cube from StreamingBatch
>>         at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.
>> build(StreamingCubeBuilder.java:105)
>>         at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.
>> run(OneOffStreamingBuilder.java:79)
>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneO
>> ffCubeStreaming(StreamingCLI.java:123)
>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(
>> StreamingCLI.java:97)
>> Caused by: java.lang.NullPointerException
>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(
>> BitmapMeasureType.java:100)
>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(
>> BitmapMeasureType.java:89)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConve
>> rter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConve
>> rter.buildValue(InMemCubeBuilderInputConverter.java:94)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConve
>> rter.convert(InMemCubeBuilderInputConverter.java:70)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConv
>> erter$1.next(InMemCubeBuilder.java:542)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConv
>> erter$1.next(InMemCubeBuilder.java:523)
>>         at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAgg
>> regateScanner.java:139)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBas
>> eCuboid(InMemCubeBuilder.java:339)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(
>> InMemCubeBuilder.java:166)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(
>> InMemCubeBuilder.java:135)
>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(
>> InMemCubeBuilder.java:122)
>>         at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$
>> 1.run(AbstractInMemCubeBuilder.java:80)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(
>> Executors.java:471)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>> Executor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>> lExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>>
>> *and the cube json is*
>> {
>>   "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
>>   "last_modified": 1474511879506,
>>   "version": "1.5.4",
>>   "name": "hot_play_c",
>>   "model_name": "hot_play_cube",
>>   "description": "",
>>   "null_string": null,
>>   "dimensions": [
>>     {
>>       "name": "DEFAULT.HOT_PLAY.HOUR_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "HOUR_START",
>>       "derived": null
>>     },
>>     {
>>       "name": "DEFAULT.HOT_PLAY.MINUTE_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "MINUTE_START",
>>       "derived": null
>>     }
>>   ],
>>   "measures": [
>>     {
>>       "name": "_COUNT_",
>>       "function": {
>>         "expression": "COUNT",
>>         "parameter": {
>>           "type": "constant",
>>           "value": "1",
>>           "next_parameter": null
>>         },
>>         "returntype": "bigint"
>>       },
>>       "dependent_measure_ref": null
>>     },
>>     {
>>       "name": "COUNT_DISTINCT_USER",
>>       "function": {
>>         "expression": "COUNT_DISTINCT",
>>         "parameter": {
>>           "type": "column",
>>           "value": "USERID",
>>           "next_parameter": null
>>         },
>>         "returntype": "bitmap"
>>       },
>>       "dependent_measure_ref": null
>>     }
>>   ],
>>   "dictionaries": [],
>>   "rowkey": {
>>     "rowkey_columns": [
>>       {
>>         "column": "HOUR_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       },
>>       {
>>         "column": "MINUTE_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       }
>>     ]
>>   },
>>   "hbase_mapping": {
>>     "column_family": [
>>       {
>>         "name": "F1",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "_COUNT_"
>>             ]
>>           }
>>         ]
>>       },
>>       {
>>         "name": "F2",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "COUNT_DISTINCT_USER"
>>             ]
>>           }
>>         ]
>>       }
>>     ]
>>   },
>>   "aggregation_groups": [
>>     {
>>       "includes": [
>>         "HOUR_START",
>>         "MINUTE_START"
>>       ],
>>       "select_rule": {
>>         "hierarchy_dims": [],
>>         "mandatory_dims": [],
>>         "joint_dims": []
>>       }
>>     }
>>   ],
>>   "signature": "QXddyWCVVCYQcozxd4Zh2w==",
>>   "notify_list": [],
>>   "status_need_notify": [
>>     "ERROR",
>>     "DISCARDED",
>>     "SUCCEED"
>>   ],
>>   "partition_date_start": 0,
>>   "partition_date_end": 3153600000000,
>>   "auto_merge_time_ranges": [
>>     604800000,
>>     2419200000
>>   ],
>>   "retention_range": 0,
>>   "engine_type": 2,
>>   "storage_type": 2,
>>   "override_kylin_properties": {}
>> }
>>
>> *no error after i change the returntype to hllc(16)*
>>
>> *i have struggled for several days. Any hints about this?*
>>

Re: Error while building cube from stream

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,

The error occurred while building a bitmap counter (for precise distinct
count). Judging from your cube descriptor, no global dictionary is specified
for the user ID column. Please check this blog:
https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
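
As a rough illustration of the check described above, the following Python
sketch scans a cube descriptor for bitmap COUNT_DISTINCT measures whose column
has no entry under "dictionaries". The descriptor is a trimmed copy of the one
posted in this thread. The helper names are hypothetical, and the builder class
name is an assumption taken from the Kylin count-distinct blog post, not
something confirmed in this thread. Whether a global dictionary can be used in
a streaming build is also not established here.

```python
import json

# Trimmed copy of the cube descriptor from this thread; only the fields the
# check reads are kept.
CUBE_DESC = json.loads("""
{
  "name": "hot_play_c",
  "measures": [
    {"name": "_COUNT_",
     "function": {"expression": "COUNT",
                  "parameter": {"type": "constant", "value": "1"},
                  "returntype": "bigint"}},
    {"name": "COUNT_DISTINCT_USER",
     "function": {"expression": "COUNT_DISTINCT",
                  "parameter": {"type": "column", "value": "USERID"},
                  "returntype": "bitmap"}}
  ],
  "dictionaries": []
}
""")

# Assumption: builder class name as given in the Kylin count-distinct blog.
GLOBAL_DICT_BUILDER = "org.apache.kylin.dict.GlobalDictionaryBuilder"


def bitmap_columns_without_dictionary(cube_desc):
    """Return columns used by bitmap COUNT_DISTINCT measures that have no
    entry in the descriptor's "dictionaries" list."""
    covered = {d.get("column") for d in cube_desc.get("dictionaries", [])}
    missing = []
    for measure in cube_desc.get("measures", []):
        fn = measure["function"]
        if fn["expression"] == "COUNT_DISTINCT" and fn["returntype"] == "bitmap":
            column = fn["parameter"]["value"]
            if column not in covered and column not in missing:
                missing.append(column)
    return missing


def suggested_dictionary_entries(columns):
    """Build hypothetical "dictionaries" entries for the given columns."""
    return [{"column": c, "builder": GLOBAL_DICT_BUILDER} for c in columns]


missing = bitmap_columns_without_dictionary(CUBE_DESC)
print("columns missing a dictionary:", missing)
print(json.dumps(suggested_dictionary_entries(missing), indent=2))
```

The printed entries are only a suggestion of what might go into the
"dictionaries" array; check the blog post for the exact fields your Kylin
version expects.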


-- 
Best regards,

Shaofeng Shi 史少锋

Re: Error while building cube from stream

Posted by Tony Lee <bt...@gmail.com>.
Thanks, ShaoFeng Shi. That was the reason.

Unfortunately, I now have a new problem with precise count distinct.

I added a streaming table on version 1.5.4 with my own JSON messages, which
look like this:
{
    "logTimestamp":1474456891127,
    "datetime":"2016-09-21 19:21:31",
    "uploadTime":"20160921192023",
    "userId":"f2d28cbf9e21340a49e97063486db1f5",
    "accountId":"84108490",
    "otherfield":"...."
}

*The error message while building the cube is:*

2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
java.lang.RuntimeException: error build cube from StreamingBatch
        at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
        at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
        at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
        at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
Caused by: java.lang.NullPointerException
        at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
        at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
        at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
        at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
        at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


*and the cube JSON is:*
{
  "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
  "last_modified": 1474511879506,
  "version": "1.5.4",
  "name": "hot_play_c",
  "model_name": "hot_play_cube",
  "description": "",
  "null_string": null,
  "dimensions": [
    {
      "name": "DEFAULT.HOT_PLAY.HOUR_START",
      "table": "DEFAULT.HOT_PLAY",
      "column": "HOUR_START",
      "derived": null
    },
    {
      "name": "DEFAULT.HOT_PLAY.MINUTE_START",
      "table": "DEFAULT.HOT_PLAY",
      "column": "MINUTE_START",
      "derived": null
    }
  ],
  "measures": [
    {
      "name": "_COUNT_",
      "function": {
        "expression": "COUNT",
        "parameter": {
          "type": "constant",
          "value": "1",
          "next_parameter": null
        },
        "returntype": "bigint"
      },
      "dependent_measure_ref": null
    },
    {
      "name": "COUNT_DISTINCT_USER",
      "function": {
        "expression": "COUNT_DISTINCT",
        "parameter": {
          "type": "column",
          "value": "USERID",
          "next_parameter": null
        },
        "returntype": "bitmap"
      },
      "dependent_measure_ref": null
    }
  ],
  "dictionaries": [],
  "rowkey": {
    "rowkey_columns": [
      {
        "column": "HOUR_START",
        "encoding": "time",
        "isShardBy": false
      },
      {
        "column": "MINUTE_START",
        "encoding": "time",
        "isShardBy": false
      }
    ]
  },
  "hbase_mapping": {
    "column_family": [
      {
        "name": "F1",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "_COUNT_"
            ]
          }
        ]
      },
      {
        "name": "F2",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "COUNT_DISTINCT_USER"
            ]
          }
        ]
      }
    ]
  },
  "aggregation_groups": [
    {
      "includes": [
        "HOUR_START",
        "MINUTE_START"
      ],
      "select_rule": {
        "hierarchy_dims": [],
        "mandatory_dims": [],
        "joint_dims": []
      }
    }
  ],
  "signature": "QXddyWCVVCYQcozxd4Zh2w==",
  "notify_list": [],
  "status_need_notify": [
    "ERROR",
    "DISCARDED",
    "SUCCEED"
  ],
  "partition_date_start": 0,
  "partition_date_end": 3153600000000,
  "auto_merge_time_ranges": [
    604800000,
    2419200000
  ],
  "retention_range": 0,
  "engine_type": 2,
  "storage_type": 2,
  "override_kylin_properties": {}
}

*There is no error after I change the returntype to hllc(16).*

*I have struggled with this for several days. Any hints?*
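
For context on what the hllc(16) workaround trades away, here is a small
Python sketch of the textbook HyperLogLog standard-error estimate,
1.04 / sqrt(m), where m is the number of registers. It assumes that hllc(16)
means 16 index bits, i.e. 2^16 registers; that reading is an assumption and
is not stated in this thread. The function name is hypothetical.

```python
import math


def hll_standard_error(precision_bits):
    """Textbook HyperLogLog relative standard error for 2**precision_bits
    registers (assumed meaning of the number in hllc(N))."""
    registers = 2 ** precision_bits
    return 1.04 / math.sqrt(registers)


for bits in (10, 12, 14, 16):
    err = hll_standard_error(bits)
    # Three standard errors is a common rule-of-thumb bound.
    print(f"hllc({bits}): std error {err:.3%}, 3-sigma {3 * err:.3%}")
```

Under that assumption, hllc(16) gives an approximate count with a relative
standard error of about 0.4%, whereas the bitmap return type is meant to give
an exact count.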


Re: Error while building cube from stream

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Tony,

It seems your cube isn't partitioned (no partition date column specified);
please check or provide the cube JSON.
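
To illustrate why a missing partition column can produce "Segments overlap",
here is a minimal Python sketch. It is not Kylin's CubeValidator; the Segment
class, FULL_RANGE constant and append_segment helper are hypothetical. It
assumes that each segment covers a half-open time range and that a build on an
unpartitioned cube is treated as covering the whole range (FULL_BUILD).

```python
from dataclasses import dataclass

# Assumption: stand-in for "covers the whole time axis" on an unpartitioned cube.
FULL_RANGE = (0, 2**63 - 1)


@dataclass(frozen=True)
class Segment:
    name: str
    start: int  # epoch milliseconds, inclusive
    end: int    # epoch milliseconds, exclusive


def overlaps(a, b):
    """True if the half-open ranges [start, end) of a and b intersect."""
    return a.start < b.end and b.start < a.end


def append_segment(existing, new):
    """Return a new segment list, rejecting any overlap with existing ones."""
    for seg in existing:
        if overlaps(seg, new):
            raise ValueError(f"Segments overlap: {seg.name} and {new.name}")
    return existing + [new]


# Partitioned case: the one-minute window from the log, then the next minute.
first = Segment("1474374540000_1474374600000", 1474374540000, 1474374600000)
second = Segment("1474374600000_1474374660000", 1474374600000, 1474374660000)
print(len(append_segment([first], second)))  # adjacent windows are accepted

# Unpartitioned case: every build spans the full range, so the second fails.
full_a = Segment("FULL_BUILD", *FULL_RANGE)
full_b = Segment("FULL_BUILD", *FULL_RANGE)
try:
    append_segment([full_a], full_b)
except ValueError as exc:
    print(exc)
```

With a partition date column in the model, each streaming batch would map to
its own time window, as in the first case.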


-- 
Best regards,

Shaofeng Shi 史少锋

Re: Error while building cube from stream

Posted by Alberto Ramón <a....@gmail.com>.
I don't know, but can you check this change: KYLIN-1744
<https://issues.apache.org/jira/browse/KYLIN-1744> in V1.3?

