Posted to dev@kylin.apache.org by dong wang <el...@gmail.com> on 2015/03/09 07:25:10 UTC

error when merging the cube

building Kylin from source (https://github.com/KylinOLAP/Kylin), when
selecting 2 segments of the cube to merge, the following error occurs:

[pool-7-thread-3]:[2015-03-09
14:22:01,609][ERROR][org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:134)]
- ExecuteException job:059d0d3e-fa69-4ef5-b06b-f5625c1599d9
org.apache.kylin.job.exception.ExecuteException:
org.apache.kylin.job.exception.ExecuteException:
java.lang.ArrayIndexOutOfBoundsException
        at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:102)
        at
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kylin.job.exception.ExecuteException:
java.lang.ArrayIndexOutOfBoundsException
        at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:102)
        at
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:99)
        ... 4 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
        at
org.apache.kylin.dict.DateStrDictionary.getValueBytesFromIdImpl(DateStrDictionary.java:191)
        at
org.apache.kylin.dict.Dictionary.getValueBytesFromId(Dictionary.java:156)
        at
org.apache.kylin.dict.DictionaryGenerator.mergeDictionaries(DictionaryGenerator.java:94)
        at
org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:149)
        at
org.apache.kylin.job.cube.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:141)
        at
org.apache.kylin.job.cube.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:131)
        at
org.apache.kylin.job.cube.MergeDictionaryStep.doWork(MergeDictionaryStep.java:68)
        at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:99)

Re: error when merging the cube

Posted by Li Yang <li...@apache.org>.
Glad the problem is gone! Let's keep an eye on similar issues.


Re: error when merging the cube

Posted by dong wang <el...@gmail.com>.
Yang, I'm not sure why the issue happens; however, it works OK with the
current latest source code from
https://github.com/KylinOLAP/Kylin/tree/staging, and the data in the log, as
checked, seems to be correct as well. The only difference I can remember is
that I previously used a "date" type as the partition column, whereas the
current partition column is a "YYYY-mm-dd" string type. Since I don't have
enough time to try to reproduce the issue, we can set it aside until it
occurs again, if it is indeed an issue.


Re: error when merging the cube

Posted by Li Yang <li...@apache.org>.
Great that you found the distinct values! There must be some hidden issue in
the file; otherwise this critical problem would have haunted us for a long
time.

Could you send a copy of the file? I can debug through it.



Re: error when merging the cube

Posted by dong wang <el...@gmail.com>.
Yang, currently I don't have an environment in which to debug the issue; I
will try to look into it next week~ However, when I check the biggest segment
of the cube with the following command:

sudo -uhdfs hadoop fs -cat /tmp/kylin-3c3159c6-012f-497d-826a-65dc9926442e/test/fact_distinct_columns/mydate | sort -nr -k 1 | wc -l

it returns 517, which is the number of distinct days, and, as confirmed, the
content of mydate is perfectly normal and regular, like:
2015-03-01
2015-02-28
2015-02-27
2015-02-26
2015-02-25
2015-02-24
2015-02-23
2015-02-22
2015-02-21
2015-02-20
2015-02-19
2015-02-18
2015-02-17
2015-02-16
2015-02-15
2015-02-14
2015-02-13



Re: error when merging the cube

Posted by Li Yang <li...@apache.org>.
Hi Dong, id 3652427 is illegal. After my fix, the biggest date is
9999-12-31. Hope your analysis won't go beyond that point in time. :-)

For your merge problem, you still need to dig into why your data generates
the illegal ID. You can look at DateStrDictionaryTest.java for details of
what's supported and what's not.

Once the data is fixed, refresh the impacted segment so that the dictionary
is rebuilt to a correct state; then the merge will be able to work.

Cheers
Yang


Re: error when merging the cube

Posted by dong wang <el...@gmail.com>.
Yang, another thing is that I'm not sure whether the value of
"id" (=3652427) is correct or not. If it is legal, then there may be a
problem in the function mentioned above; if it is illegal, there should be a
problem in the logic that generates the id.

Re: error when merging the cube

Posted by dong wang <el...@gmail.com>.
Hi Yang, any update on the issue? Since the cube is built day by day, if the
cube merge feature doesn't work, performance will suffer once the count of
segments grows large, according to the documents~


Re: error when merging the cube

Posted by Li Yang <li...@apache.org>.
Interesting... I'll take a look in the afternoon.


Re: error when merging the cube

Posted by dong wang <el...@gmail.com>.
Hi Yang, as the log above shows, the exception is thrown from
System.arraycopy(bytes, 0, returnValue, offset, bytes.length). As debugged,
the value of "bytes" is "10000-01-01", so the length of "bytes" is 11, while
the value of "returnValue" is "9999-12-31", whose length is 10; therefore the
exception is thrown. What's more, when this exception occurs, the parameter
"id" passed to the function below is 3652427; getValueBytesFromIdImpl()
internally calls getDateFromNumOfDaysSince0000(), and the returned date is
"10000-01-01". So is it possible that there is something we haven't taken
into consideration in getDateFromNumOfDaysSince0000()?

final protected int getValueBytesFromIdImpl(int id, byte[] returnValue, int offset) {
        String date = getValueFromId(id);
        byte[] bytes;
        try {
            bytes = date.getBytes("ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // never happens
        }
        // overruns returnValue when the decoded date is longer than the buffer
        System.arraycopy(bytes, 0, returnValue, offset, bytes.length);
        return bytes.length;
}

    private Date getDateFromNumOfDaysSince0000(int n) {
        // 719530 is the offset (in days) of 1970-01-01 in the days-since-0000 numbering
        long millis = ((long) n - 719530) * 86400000;
        return new Date(millis);
    }
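
To see why id 3652427 decodes to an 11-byte value, the arithmetic can be
replayed outside Kylin. This is a hypothetical standalone sketch (the class
and method names are mine, not Kylin's), using the same formula as above and
"yyyy-MM-dd" formatting in GMT:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateIdOverflowDemo {
    // Same formula as getDateFromNumOfDaysSince0000() above:
    // 719530 is the offset (in days) of 1970-01-01 from year 0000.
    static Date dateFromDaysSince0000(int n) {
        return new Date(((long) n - 719530) * 86400000L);
    }

    static String format(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(d);
    }

    public static void main(String[] args) {
        String date = format(dateFromDaysSince0000(3652427));
        System.out.println(date + " (" + date.length() + " chars)"); // 10000-01-01 (11 chars)

        byte[] returnValue = new byte[10]; // sized for "9999-12-31"
        byte[] bytes = date.getBytes();    // 11 bytes
        try {
            // the same copy that fails in getValueBytesFromIdImpl()
            System.arraycopy(bytes, 0, returnValue, 0, bytes.length);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("overflow: " + bytes.length
                    + " bytes into a " + returnValue.length + "-byte buffer");
        }
    }
}
```

Any id that decodes to a date past 9999-12-31 yields an 11-character string,
which matches the ArrayIndexOutOfBoundsException in the stack trace.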

Re: error when merging the cube

Posted by Li Yang <li...@apache.org>.
Interesting... I cannot debug this myself for lack of context...

Could you try debugging into DictionaryGenerator.mergeDictionaries()? The
problem is that the buffer array is not big enough to hold a dictionary
value, even though the array was allocated according to the dictionary's own
demand at line 92: byte[] buffer = new byte[dict.getSizeOfValue()];

Never encountered this one before.

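The sizing contract being broken can be sketched with stand-in code
(hypothetical names, not Kylin's actual classes): the merge buffer is
allocated from the dictionary's declared maximum value size, so any id that
decodes to a longer value than the dictionary promises will overrun it:

```java
public class BufferSizingDemo {
    // Stand-in for DateStrDictionary.getSizeOfValue(): the dictionary
    // promises values of at most 10 bytes ("yyyy-MM-dd").
    static int getSizeOfValue() {
        return "9999-12-31".length(); // 10
    }

    // Stand-in for Dictionary.getValueBytesFromId(): copies the decoded
    // value into the caller's buffer and returns its length.
    static int getValueBytesFromId(String decoded, byte[] buffer, int offset) {
        byte[] bytes = decoded.getBytes();
        System.arraycopy(bytes, 0, buffer, offset, bytes.length);
        return bytes.length;
    }

    public static void main(String[] args) {
        // mirrors line 92 of DictionaryGenerator.mergeDictionaries()
        byte[] buffer = new byte[getSizeOfValue()];

        getValueBytesFromId("2015-03-01", buffer, 0); // legal value: fits
        try {
            getValueBytesFromId("10000-01-01", buffer, 0); // 11 bytes: overruns
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("buffer sized for legal values cannot hold the illegal one");
        }
    }
}
```

If this mirrors the real code, a bigger buffer is not the fix; the dictionary
must refuse ids that decode past 9999-12-31.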