Posted to dev@asterixdb.apache.org by Steven Jacobs <sj...@ucr.edu> on 2016/02/19 07:29:39 UTC

Re: Cannot load an index that is not empty [TreeIndexException]

Hi,
Welcome! We are an Apache incubator project now so I added the correct
mailing list. Our "load" statement only works on an empty dataset.
Subsequent data needs to be added with an insert or a feed. You should be
able to load all 76 files at once though (starting from empty).
Steven

On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com> wrote:

> Hi Asterix team!
>
> I came across this error while trying to load 76 files into a
> dataset. When I test-loaded the first 32 files, there was no such error.
> All 76 files have the same data format.
>
> Can you help interpret what this error message means?
>
> Thanks!
> Yiran
>
> --
> Best,
> Yiran
>
> --
> You received this message because you are subscribed to the Google Groups
> "asterixdb-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to asterixdb-dev+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Michael Carey <mj...@ics.uci.edu>.
Thanks, everyone, for the quick investigation and resolution today - 
awesome!
(Looks like we have some error-case user experience work to do to handle 
such cases more cleanly? :-))
Cheers,
Mike
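
For anyone hitting the same exception: the thread below traces it to duplicated primary keys in the input files, so a pre-load check can help locate them. What follows is a minimal Python sketch, not part of AsterixDB. It assumes one JSON record per line and a primary key field named "id"; both are hypothetical, since the thread states neither the file format nor the key name. The names scan_files and json_id_key are illustrative only.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple

Location = Tuple[str, int]  # (file path, 1-based line number)


def scan_files(
    paths: Iterable[str],
    key_of: Callable[[str], str],
) -> Tuple[int, Dict[str, List[Location]]]:
    """Count records and report every primary key that occurs more than once.

    Returns (total_records, duplicates), where duplicates maps each repeated
    key to all of the locations at which it was seen.
    """
    first_seen: Dict[str, Location] = {}
    duplicates: Dict[str, List[Location]] = {}
    total = 0
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line_no, line in enumerate(handle, start=1):
                if not line.strip():
                    continue  # ignore blank lines
                total += 1
                key = key_of(line)
                here = (path, line_no)
                if key in first_seen:
                    # Record the first location once, then each repeat.
                    duplicates.setdefault(key, [first_seen[key]]).append(here)
                else:
                    first_seen[key] = here
    return total, duplicates


def json_id_key(line: str) -> str:
    # ASSUMPTION: one JSON object per line, primary key in a field named "id".
    return str(json.loads(line)["id"])
```

Calling scan_files(list_of_paths, json_id_key) returns the record count, which can be compared with the row count reported by the dataset, and a map from each duplicated key to the files and lines where it appears. The sketch keeps every key in memory, so for tens of millions of records a sort-based approach may be more practical.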

On 2/19/16 4:38 PM, Yiran Wang wrote:
> I'm not sure the loading issue was a bug. It looks like it was
> caused by duplicated primary keys in my data, and it is now
> solved after I updated those keys. I was not sure what the error
> message meant when I first started this email thread.
>
> Thanks all for jumping on this quickly!
> Yiran
>
>
> On Fri, Feb 19, 2016 at 4:25 PM, Chen Li <chenli@gmail.com> wrote:
>
>     It will be good to file a ticket to keep track of this bug.
>
>     Chen
>
>     On Fri, Feb 19, 2016 at 3:40 PM, Young-Seok Kim
>     <kisskys@gmail.com> wrote:
>
>         Please let us know how it goes with the change.
>
>         Young-Seok
>
>         On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang
>         <wyr4137@gmail.com> wrote:
>
>             Young-Seok,
>
>             I will just go ahead and change the duplicated keys I have
>             in my original file. That should solve my loading problem.
>             I was describing what's going on in case that is relevant
>             for you to understand why a lot of files still got loaded
>             into the dataset.
>
>             Thanks!
>             Yiran
>
>             On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim
>             <kisskys@gmail.com> wrote:
>
>                 Yiran,
>
>                 Could you show all the AQL statements involved in the
>                 load, indicating the problematic file that contains
>                 the duplicated primary keys?
>                 Then we may better understand what's going on and,
>                 hopefully, find a solution.
>
>                 On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang
>                 <wyr4137@gmail.com> wrote:
>
>                     Young-Seok,
>
>                     Thank you for your feedback. You are right, there
>                     are some duplicated primary keys. It took me some
>                     time, but I did locate the file that the
>                     duplicated primary keys come from.
>
>                     If the load statement loads files in the order
>                     written in the query, the problematic file is
>                     located towards the end. Maybe that is why
>                     many instances still got loaded into the
>                     dataset before it hit the problematic file?
>
>                     Thanks again,
>                     Yiran
>
>                     On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim
>                     <kisskys@gmail.com> wrote:
>
>                         From a quick look at the log, there seem to
>                         be duplicated primary keys in the files to
>                         be loaded.
>                         That seems to be the first cause of the problem.
>                         But I'm not sure why the load query continues
>                         trying to load more data instead of stopping
>                         when the duplication is found.
>                         This unexpected behavior seems to have
>                         triggered the "Cannot load an index that is
>                         not empty" exception.
>
>                         The following snippet shows the exceptions
>                         that appear in the attached log file.
>
>                         ---------------------------------------
>                         SEVERE: Setting uncaught exception handler
>                         edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>                         edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>                         edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>                         edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>                         Input stream given to BTree bulk load has
>                         duplicates.
>                         Caused by:
>                         edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>                         edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>                         Input stream given to BTree bulk load has
>                         duplicates.
>                         Caused by:
>                         edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>                         Input stream given to BTree bulk load has
>                         duplicates.
>                         edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>                         edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException:
>                         Cannot load an index that is not empty
>                         Caused by:
>                         edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException:
>                         Cannot load an index that is not empty
>
>                         Best,
>                         Young-Seok
>
>                         On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang
>                         <wyr4137@gmail.com> wrote:
>
>                             Abdullah,
>
>                             Here is the log attached. Thank you all
>                             very much for looking into this.
>
>                             Ian - I have two query questions besides
>                             this loading issue. I was wondering if I
>                             can meet briefly with you (or over email)
>                             regarding that.
>
>                             Thanks!
>                             Yiran
>
>                             On Fri, Feb 19, 2016 at 9:38 AM, Mike
>                             Carey <dtabass@gmail.com> wrote:
>
>                                 Maybe Ian can visit the cluster with
>                                 Yiran later today?
>
>                                 On Feb 19, 2016 1:31 AM, "abdullah
>                                 alamoudi" <bamousaa@gmail.com> wrote:
>
>                                     Yiran,
>                                     Can you share the logs? It would
>                                     help us identify the actual
>                                     cause of this failure much faster.
>
>                                     I am pretty sure you know this, but
>                                     in case you don't, you can get
>                                     the logs using
>                                     >managix log -n <instance-name>
>
>                                     Also, it would be nice if someone
>                                     from the team had access to the
>                                     cluster so we could work with it
>                                     directly.
>                                     Cheers,
>                                     Abdullah.
>
>
>                                     On Fri, Feb 19, 2016 at 9:40 AM,
>                                     Yiran Wang <wyr4137@gmail.com> wrote:
>
>                                         Steven,
>
>                                         Thanks for getting back to me
>                                         so quickly! I wasn't clear.
>                                         Here is what happened:
>
>                                         I test-loaded the first 32
>                                         files, no problem. I deleted
>                                         the dataset, created a new
>                                         one, and tried to load all
>                                         76 files into the newly
>                                         created (hence empty) dataset.
>
>                                         It took about 2 minutes after
>                                         executing the query for the
>                                         error message to show up.
>                                         There are currently 31710406
>                                         rows of data in the dataset,
>                                         despite the error message (so
>                                         it looks like it did load).
>
>                                         So my questions are: 1) why
>                                         did I still get that error
>                                         message when I was loading into
>                                         an empty dataset; and 2) I'm
>                                         not sure whether all the data from
>                                         the 76 files was fully loaded.
>                                         Are there other ways to check,
>                                         besides trying to load it
>                                         again and hoping this time I
>                                         don't get the error?
>
>                                         Thanks!
>                                         Yiran
>


Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Chen Li <ch...@gmail.com>.
It will be good to file a ticket to keep track of this bug.

Chen


Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Young-Seok Kim <ki...@gmail.com>.
Please let us know how it goes with the change.

Young-Seok

On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang <wy...@gmail.com> wrote:

> Young-Seok,
>
> I will just go ahead and change the duplicated keys I have in my original
> file. That should solve my loading problem. I was describing what's going
> on in case that is relevant for you to understand why a lot of files still
> got loaded into the dataset.
>
> Thanks!
> Yiran
>
> On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim <ki...@gmail.com> wrote:
>
>> Yiran,
>>
>> Could you show all AQLs involved in the loading with indicating the
>> problematic file which includes the duplicated primary keys?
>> Then, we may better understand what's going on and may get the solution
>> hopefully.
>>
>> On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <wy...@gmail.com> wrote:
>>
>>> Young-Seok,
>>>
>>> Thank you for your feedback. You are right there are some duplicated
>>> primary keys. It took me some time, but I did locate the file where the
>>> duplicated primary keys are from.
>>>
>>> If the load function loads files in sequence as written in the query,
>>> the problematic file is located towards the end. Maybe that is why there
>>> are still many instances got loaded into the dataset before it hit the
>>> problematic file?
>>>
>>> Thanks again,
>>> Yiran
>>>
>>> On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <ki...@gmail.com>
>>> wrote:
>>>
>>>> By quickly looking at the log, there seems to exist duplicated primary
>>>> keys in the files to be loaded.
>>>> That seems the first cause of the problem.
>>>> But I'm not sure why the load query continues trying to load data
>>>> further instead of stop when the duplication was found.
>>>> This unexpected behavior seems to have introduced the "Cannot load an
>>>> index that is not empty" exception.
>>>>
>>>> The following shows the snippet of the exceptions appeared in the log
>>>> file attached.
>>>>
>>>> ---------------------------------------
>>>> SEVERE: Setting uncaught exception handler
>>>> edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> Caused by:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>> an index that is not empty
>>>> Caused by:
>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>> an index that is not empty
>>>>
>>>> Best,
>>>> Young-Seok
>>>>
>>>> On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <wy...@gmail.com> wrote:
>>>>
>>>>> Abdullah,
>>>>>
>>>>> Here is the log attached. Thank you all very much for looking into
>>>>> this.
>>>>>
>>>>> Ian - I have two query questions besides this loading issue. I was
>>>>> wondering if I can meet briefly with you (or over email) regarding that.
>>>>>
>>>>> Thanks!
>>>>> Yiran
>>>>>
>>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:
>>>>>
>>>>>> Maybe Ian can visit the cluster with Yiran later today?
>>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Yiran,
>>>>>>> Can you share the logs? It would help us identifying the actual
>>>>>>> cause of this failure much faster.
>>>>>>>
>>>>>>> I am pretty sure you know this but in case you didn't, you can get
>>>>>>> the logs using
>>>>>>> >managix log -n <instance-name>
>>>>>>>
>>>>>>> Also, it would be nice if someone from the team has access to the
>>>>>>> cluster so we can work with it directly.
>>>>>>> Cheers,
>>>>>>> Abdullah.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Steven,
>>>>>>>>
>>>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
>>>>>>>> what happened:
>>>>>>>>
>>>>>>>> I test-loaded the first 32 files, no problem. I deleted the
>>>>>>>> dataset, created a new one, and tried to load the entire 76 files into the
>>>>>>>> newly created (hence empty) dataset.
>>>>>>>>
>>>>>>>> It took about 2mins after executing the query for the error message
>>>>>>>> to show up. There are currently 31710406 rows of data in the dataset,
>>>>>>>> despite the error message (so it looks like it did load).
>>>>>>>>
>>>>>>>> So my questions are: 1) why did I still get that error message when
>>>>>>>> I was loading to an empty dataset; and 2) I'm not sure if all the data from
>>>>>>>> the 76 file are fully loaded. Is there other ways to check, besides trying
>>>>>>>> to load it again and hope this time I don't get the error?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Yiran
>>>>>>>>
>>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> Welcome! We are an Apache incubator project now so I added the
>>>>>>>>> correct mailing list. Our "load" statement only works on an empty dataset.
>>>>>>>>> Subsequent data needs to be added with an insert or a feed. You should be
>>>>>>>>> able to load all 76 files at once though (starting from empty).
>>>>>>>>> Steven
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Asterix team!
>>>>>>>>>>
>>>>>>>>>> I've come across this error when I was trying to load 76 files
>>>>>>>>>> into a dataset. When I test-loaded the first 32 files, there wasn't such an
>>>>>>>>>> error. All 76 files are of the same data format.
>>>>>>>>>>
>>>>>>>>>> Can you help interpret what this error message means?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> Yiran
>>>>>>>>>>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Yiran Wang <wy...@gmail.com>.
Young-Seok,

I will just go ahead and change the duplicated keys in my original file.
That should solve my loading problem. I described what happened in case it
helps you understand why a lot of files still got loaded into the dataset.

Thanks!
Yiran
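
For anyone who needs to locate duplicated primary keys before a load, here
is a minimal sketch of one way to do it. It is not an AsterixDB tool. It
assumes the input files hold one JSON object per line, which this thread
does not confirm, and the function name find_duplicate_keys and the
key_field parameter are hypothetical. It keeps every key in memory, so it
is only a starting point for very large inputs.

```python
import json
from collections import defaultdict


def find_duplicate_keys(paths, key_field):
    """Return {key: [(path, line_number), ...]} for keys seen more than once.

    Assumption: each non-empty line of each file is one JSON object that
    contains `key_field`. Adapt the parsing step to the real data format.
    """
    seen = defaultdict(list)
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                record = json.loads(line)
                seen[record[key_field]].append((path, line_number))
    # Keep only the keys that occur in more than one place.
    return {key: places for key, places in seen.items() if len(places) > 1}
```

Calling find_duplicate_keys(list_of_files, "id") returns every repeated key
together with the file and line of each occurrence, which shows which file
to fix.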

On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim <ki...@gmail.com> wrote:

> Yiran,
>
> Could you show all AQLs involved in the loading with indicating the
> problematic file which includes the duplicated primary keys?
> Then, we may better understand what's going on and may get the solution
> hopefully.
>
> On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <wy...@gmail.com> wrote:
>
>> Young-Seok,
>>
>> Thank you for your feedback. You are right there are some duplicated
>> primary keys. It took me some time, but I did locate the file where the
>> duplicated primary keys are from.
>>
>> If the load function loads files in sequence as written in the query, the
>> problematic file is located towards the end. Maybe that is why there are
>> still many instances got loaded into the dataset before it hit the
>> problematic file?
>>
>> Thanks again,
>> Yiran
>>
>> On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <ki...@gmail.com>
>> wrote:
>>
>>> By quickly looking at the log, there seems to exist duplicated primary
>>> keys in the files to be loaded.
>>> That seems the first cause of the problem.
>>> But I'm not sure why the load query continues trying to load data
>>> further instead of stop when the duplication was found.
>>> This unexpected behavior seems to have introduced the "Cannot load an
>>> index that is not empty" exception.
>>>
>>> The following shows the snippet of the exceptions appeared in the log
>>> file attached.
>>>
>>> ---------------------------------------
>>> SEVERE: Setting uncaught exception handler
>>> edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>> Input stream given to BTree bulk load has duplicates.
>>> Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>> Input stream given to BTree bulk load has duplicates.
>>> Caused by:
>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>> Input stream given to BTree bulk load has duplicates.
>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>> an index that is not empty
>>> Caused by: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException:
>>> Cannot load an index that is not empty
>>>
>>> Best,
>>> Young-Seok
>>>
>>> On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <wy...@gmail.com> wrote:
>>>
>>>> Abdullah,
>>>>
>>>> Here is the log attached. Thank you all very much for looking into this.
>>>>
>>>> Ian - I have two query questions besides this loading issue. I was
>>>> wondering if I can meet briefly with you (or over email) regarding that.
>>>>
>>>> Thanks!
>>>> Yiran
>>>>
>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:
>>>>
>>>>> Maybe Ian can visit the cluster with Yiran later today?
>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yiran,
>>>>>> Can you share the logs? It would help us identifying the actual cause
>>>>>> of this failure much faster.
>>>>>>
>>>>>> I am pretty sure you know this but in case you didn't, you can get
>>>>>> the logs using
>>>>>> >managix log -n <instance-name>
>>>>>>
>>>>>> Also, it would be nice if someone from the team has access to the
>>>>>> cluster so we can work with it directly.
>>>>>> Cheers,
>>>>>> Abdullah.
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Steven,
>>>>>>>
>>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
>>>>>>> what happened:
>>>>>>>
>>>>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
>>>>>>> created a new one, and tried to load the entire 76 files into the newly
>>>>>>> created (hence empty) dataset.
>>>>>>>
>>>>>>> It took about 2mins after executing the query for the error message
>>>>>>> to show up. There are currently 31710406 rows of data in the dataset,
>>>>>>> despite the error message (so it looks like it did load).
>>>>>>>
>>>>>>> So my questions are: 1) why did I still get that error message when
>>>>>>> I was loading to an empty dataset; and 2) I'm not sure if all the data from
>>>>>>> the 76 file are fully loaded. Is there other ways to check, besides trying
>>>>>>> to load it again and hope this time I don't get the error?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Yiran
>>>>>>>
>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Welcome! We are an Apache incubator project now so I added the
>>>>>>>> correct mailing list. Our "load" statement only works on an empty dataset.
>>>>>>>> Subsequent data needs to be added with an insert or a feed. You should be
>>>>>>>> able to load all 76 files at once though (starting from empty).
>>>>>>>> Steven
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Asterix team!
>>>>>>>>>
>>>>>>>>> I've come across this error when I was trying to load 76 files
>>>>>>>>> into a dataset. When I test-loaded the first 32 files, there wasn't such an
>>>>>>>>> error. All 76 files are of the same data format.
>>>>>>>>>
>>>>>>>>> Can you help interpret what this error message means?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Yiran
>>>>>>>>>



-- 
Best,
Yiran

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Young-Seok Kim <ki...@gmail.com>.
Yiran,

Could you share all the AQL statements involved in the load and indicate
which file contains the duplicated primary keys?
Then we can better understand what is going on and, hopefully, find a
solution.

On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <wy...@gmail.com> wrote:

> Young-Seok,
>
> Thank you for your feedback. You are right there are some duplicated
> primary keys. It took me some time, but I did locate the file where the
> duplicated primary keys are from.
>
> If the load function loads files in sequence as written in the query, the
> problematic file is located towards the end. Maybe that is why there are
> still many instances got loaded into the dataset before it hit the
> problematic file?
>
> Thanks again,
> Yiran
>
> On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <ki...@gmail.com>
> wrote:
>
>> By quickly looking at the log, there seems to exist duplicated primary
>> keys in the files to be loaded.
>> That seems the first cause of the problem.
>> But I'm not sure why the load query continues trying to load data further
>> instead of stop when the duplication was found.
>> This unexpected behavior seems to have introduced the "Cannot load an
>> index that is not empty" exception.
>>
>> The following shows the snippet of the exceptions appeared in the log
>> file attached.
>>
>> ---------------------------------------
>> SEVERE: Setting uncaught exception handler
>> edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>> Input stream given to BTree bulk load has duplicates.
>> Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>> Input stream given to BTree bulk load has duplicates.
>> Caused by:
>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>> Input stream given to BTree bulk load has duplicates.
>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>> an index that is not empty
>> Caused by: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException:
>> Cannot load an index that is not empty
>>
>> Best,
>> Young-Seok
>>
>> On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <wy...@gmail.com> wrote:
>>
>>> Abdullah,
>>>
>>> Here is the log attached. Thank you all very much for looking into this.
>>>
>>> Ian - I have two query questions besides this loading issue. I was
>>> wondering if I can meet briefly with you (or over email) regarding that.
>>>
>>> Thanks!
>>> Yiran
>>>
>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:
>>>
>>>> Maybe Ian can visit the cluster with Yiran later today?
>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com>
>>>> wrote:
>>>>
>>>>> Yiran,
>>>>> Can you share the logs? It would help us identifying the actual cause
>>>>> of this failure much faster.
>>>>>
>>>>> I am pretty sure you know this but in case you didn't, you can get the
>>>>> logs using
>>>>> >managix log -n <instance-name>
>>>>>
>>>>> Also, it would be nice if someone from the team has access to the
>>>>> cluster so we can work with it directly.
>>>>> Cheers,
>>>>> Abdullah.
>>>>>
>>>>>
>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com> wrote:
>>>>>
>>>>>> Steven,
>>>>>>
>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
>>>>>> what happened:
>>>>>>
>>>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
>>>>>> created a new one, and tried to load the entire 76 files into the newly
>>>>>> created (hence empty) dataset.
>>>>>>
>>>>>> It took about 2mins after executing the query for the error message
>>>>>> to show up. There are currently 31710406 rows of data in the dataset,
>>>>>> despite the error message (so it looks like it did load).
>>>>>>
>>>>>> So my questions are: 1) why did I still get that error message when I
>>>>>> was loading to an empty dataset; and 2) I'm not sure if all the data from
>>>>>> the 76 file are fully loaded. Is there other ways to check, besides trying
>>>>>> to load it again and hope this time I don't get the error?
>>>>>>
>>>>>> Thanks!
>>>>>> Yiran
>>>>>>
>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> Welcome! We are an Apache incubator project now so I added the
>>>>>>> correct mailing list. Our "load" statement only works on an empty dataset.
>>>>>>> Subsequent data needs to be added with an insert or a feed. You should be
>>>>>>> able to load all 76 files at once though (starting from empty).
>>>>>>> Steven
>>>>>>>
>>>>>>>
>>>>>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Asterix team!
>>>>>>>>
>>>>>>>> I've come across this error when I was trying to load 76 files into
>>>>>>>> a dataset. When I test-loaded the first 32 files, there wasn't such an
>>>>>>>> error. All 76 files are of the same data format.
>>>>>>>>
>>>>>>>> Can you help interpret what this error message means?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Yiran
>>>>>>>>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Young-Seok Kim <ki...@gmail.com>.
From a quick look at the log, there seem to be duplicated primary keys in
the files being loaded.
That appears to be the initial cause of the problem.
However, I'm not sure why the load query kept trying to load more data
instead of stopping when the duplicate was found.
This unexpected behavior seems to have triggered the "Cannot load an index
that is not empty" exception.

The following is a snippet of the exceptions that appear in the attached
log file.

---------------------------------------
SEVERE: Setting uncaught exception handler edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
Caused by: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty
Caused by: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty

Best,
Young-Seok
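
When a log contains many wrapped exceptions like these, it can help to
reduce each line to its innermost exception. The sketch below is one
possible way to do that and is not part of AsterixDB or managix. It assumes
each exception sits on a single line of the log file, in the form
"package.ClassName: message" with an optional "Caused by:" prefix. The
function name innermost_exceptions is hypothetical.

```python
import re
from collections import Counter

# Matches a Java exception line such as
#   "Caused by: some.pkg.SomeException: message"
# The "Caused by:" prefix is optional and is not captured.
EXC_LINE = re.compile(
    r"^(?:Caused by:\s*)?((?:[\w$]+\.)+[\w$]*Exception):\s*(.*)$"
)


def innermost_exceptions(log_text):
    """Count (exception class, message) pairs found in log_text.

    Only the innermost exception named on each line is counted, so
    wrappers around the real failure are ignored.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = EXC_LINE.match(line.strip())
        if not match:
            continue
        cls, msg = match.group(1), match.group(2)
        # A wrapping exception repeats the same pattern inside its message;
        # peel the wrappers off until the message is plain text.
        inner = EXC_LINE.match(msg)
        while inner:
            cls, msg = inner.group(1), inner.group(2)
            inner = EXC_LINE.match(msg)
        # Keep only the simple class name, without the package.
        counts[(cls.rsplit(".", 1)[-1], msg)] += 1
    return counts
```

Run on the exceptions quoted in this message, such a summary would list the
duplicate-key failure and the "Cannot load an index that is not empty"
failure as two separate entries, which makes it easier to see which one is
reported more often.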

On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <wy...@gmail.com> wrote:

> Abdullah,
>
> Here is the log attached. Thank you all very much for looking into this.
>
> Ian - I have two query questions besides this loading issue. I was
> wondering if I can meet briefly with you (or over email) regarding that.
>
> Thanks!
> Yiran
>
> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:
>
>> Maybe Ian can visit the cluster with Yiran later today?
>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com> wrote:
>>
>>> Yiran,
>>> Can you share the logs? It would help us identify the actual cause of
>>> this failure much faster.
>>>
>>> I am pretty sure you know this but in case you didn't, you can get the
>>> logs using
>>> >managix log -n <instance-name>
>>>
>>> Also, it would be nice if someone from the team has access to the
>>> cluster so we can work with it directly.
>>> Cheers,
>>> Abdullah.
>>>
>>>
>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com> wrote:
>>>
>>>> Steven,
>>>>
>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is what
>>>> happened:
>>>>
>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
>>>> created a new one, and tried to load the entire 76 files into the newly
>>>> created (hence empty) dataset.
>>>>
>>>> It took about 2mins after executing the query for the error message to
>>>> show up. There are currently 31710406 rows of data in the dataset, despite
>>>> the error message (so it looks like it did load).
>>>>
>>>> So my questions are: 1) why did I still get that error message when I
>>>> was loading into an empty dataset; and 2) I'm not sure whether all the
>>>> data from the 76 files is fully loaded. Are there other ways to check,
>>>> besides trying to load it again and hoping I don't get the error?
>>>>
>>>> Thanks!
>>>> Yiran
>>>>
>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> Welcome! We are an Apache incubator project now so I added the correct
>>>>> mailing list. Our "load" statement only works on an empty dataset.
>>>>> Subsequent data needs to be added with an insert or a feed. You should be
>>>>> able to load all 76 files at once though (starting from empty).
>>>>> Steven
>>>>>
>>>>>
>>>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com> wrote:
>>>>>
>>>>>> Hi Asterix team!
>>>>>>
>>>>>> I've come across this error when I was trying to load 76 files into a
>>>>>> dataset. When I test-loaded the first 32 files, there wasn't such an error.
>>>>>> All 76 files are of the same data format.
>>>>>>
>>>>>> Can you help interpret what this error message means?
>>>>>>
>>>>>> Thanks!
>>>>>> Yiran
>>>>>>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Wail Alkowaileet <wa...@gmail.com>.
I ran into this issue before. I tried to load 23K files (I know it's
ridiculous) and the job failed, but after coalescing them into 1000 files
the load worked just fine. It would also be nice to repartition the input
when loading one large file, to speed up the parsing process.

I remember a similar issue appeared in Spark:
https://github.com/mesos/spark/pull/718
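
The coalescing workaround described above can be sketched roughly as follows.
This is only a minimal sketch and not part of AsterixDB: the function name
coalesce_files, the part-NNNNN.adm output names, and the assumption that the
inputs hold one record per line (and that each input fits in memory) are all
hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def coalesce_files(inputs: Sequence[Path], out_dir: Path, n_out: int) -> List[Path]:
    """Concatenate many small input files into at most n_out larger files.

    Assumes one record per line, so files can simply be appended to each
    other; a newline is added after any input that does not end with one.
    """
    if n_out < 1:
        raise ValueError("n_out must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    ordered = sorted(inputs)
    n_out = min(n_out, len(ordered))
    # Hypothetical output naming; any scheme the load statement can list works.
    outputs = [out_dir / f"part-{i:05d}.adm" for i in range(n_out)]
    for i, out_path in enumerate(outputs):
        # Round-robin grouping: output i receives inputs i, i + n_out, ...
        with out_path.open("wb") as dst:
            for src in ordered[i::n_out]:
                data = src.read_bytes()
                dst.write(data)
                if data and not data.endswith(b"\n"):
                    dst.write(b"\n")
    return outputs
```

Writing one output at a time keeps only a single file handle open, which
matters when the number of outputs approaches the per-process file limit.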

On Sun, Feb 21, 2016 at 8:52 AM, Till Westmann <ti...@apache.org> wrote:

> Sounds like a good candidate for a JIRA issue, so we won't forget. :)
>
> Cheers,
> Till
>
> > On Feb 20, 2016, at 21:44, abdullah alamoudi <ba...@gmail.com> wrote:
> >
> > Totally agree. Probably better make sure it works nicely with that many
> > tasks and then fix the number of readers.
> >
> > Cheers,
> > Abdullah.
> >
> >> On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <dt...@gmail.com> wrote:
> >>
> >> Sounds like the load job parallelism needs a redo - it probably shouldn't
> >> be more than the number of target partitions IMO...?
> >>> On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <ba...@gmail.com> wrote:
> >>>
> >>> I have an idea that might explain why such a strange behavior happened. I
> >>> believe it could be due to the number of task partitions being very high
> >>> assuming each of the 76 files is being read in a separate task.
> >>> This could potentially lead to some corner cases that we didn't consider
> >>> before considering the number of threads in the tasks thread pool is less
> >>> than 76, some tasks will not be able to start until others have completed
> >>> execution.
> >>>
> >>> Just a thought,
> >>> Abdullah.
> >>>
> >>> On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <ba...@gmail.com>
> >>> wrote:
> >>>
> >>>> Yiran,
> >>>> Here is one problem causing a failure:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> >>>> Input stream given to BTree bulk load has duplicates.
> >>>>
> >>>> which tells us that Input stream given to BTree bulk load has duplicates.
> >>>> The question is why this was not returned as the error message? We need to
> >>>> look into that.
> >>>>
> >>>> I will continue looking at the log file to see if there were other issues.
> >>>>
> >>>> Can you share with us the load statement you're using? I would like to see
> >>>> how you're loading all the files. we might be able to suggest a way to make
> >>>> it work better.
> >>>>
> >>>> Cheers,
> >>>> Abdullah.
> >>>>
> >>>>> On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <wy...@gmail.com> wrote:
> >>>>>
> >>>>> Abdullah,
> >>>>>
> >>>>> Here is the log attached. Thank you all very much for looking into this.
> >>>>>
> >>>>> Ian - I have two query questions besides this loading issue. I was
> >>>>> wondering if I can meet briefly with you (or over email) regarding that.
> >>>>>
> >>>>> Thanks!
> >>>>> Yiran
> >>>>>
> >>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:
> >>>>>
> >>>>>> Maybe Ian can visit the cluster with Yiran later today?



-- 

*Regards,*
Wail Alkowaileet
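
The TreeIndexDuplicateKeyException quoted in this thread says the input given
to the BTree bulk load contained duplicate keys. A check that could be run on
the input files before issuing the load might look roughly like this. It is a
sketch under stated assumptions, not an AsterixDB utility: find_duplicate_keys
and key_field are made-up names, and the records are assumed to be one JSON
object per line, which may not hold for every ADM file.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def find_duplicate_keys(lines: Iterable[str], key_field: str) -> Dict[str, int]:
    """Return {key: count} for every primary-key value seen more than once.

    Assumes each non-empty line is a JSON object containing key_field.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        counts[str(record[key_field])] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result means no primary key repeats within the lines that were
checked; to cover a multi-file load, the lines of all files would need to be
passed through the same call.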

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Yingyi Bu <bu...@gmail.com>.
>> Sounds like the load job parallelism needs a redo - it probably shouldn't
>> be more than the number of target partitions IMO...?

Yes, we've already done that for loading data from HDFS, but haven't done
that for loading data from LocalFS.
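
Capping the number of readers at the number of target partitions, as suggested
in the quoted message, could be sketched like this. It is only an illustration
of the idea and not how the AsterixDB LocalFS adapter is implemented; the
names assign_files_to_readers and num_partitions are hypothetical.

```python
from typing import Dict, List, Sequence


def assign_files_to_readers(files: Sequence[str], num_partitions: int) -> Dict[int, List[str]]:
    """Spread input files over at most num_partitions readers, round-robin."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Never start more readers than there are files or target partitions.
    num_readers = min(len(files), num_partitions)
    assignment: Dict[int, List[str]] = {r: [] for r in range(num_readers)}
    for i, name in enumerate(files):
        assignment[i % num_readers].append(name)
    return assignment
```

With 76 files and, say, 8 partitions, each reader would handle 9 or 10 files
in sequence instead of 76 tasks competing for the thread pool.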


Best,
Yingyi


On Sat, Feb 20, 2016 at 10:23 PM, abdullah alamoudi <ba...@gmail.com>
wrote:

> Done.
>
> Abdullah.
>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by abdullah alamoudi <ba...@gmail.com>.
Done.

Abdullah.

On Sun, Feb 21, 2016 at 8:52 AM, Till Westmann <ti...@apache.org> wrote:

> Sounds like a good candidate for a JIRA issue, so we won't forget. :)
>
> Cheers,
> Till
>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Till Westmann <ti...@apache.org>.
Sounds like a good candidate for a JIRA issue, so we won't forget. :)

Cheers,
Till

> On Feb 20, 2016, at 21:44, abdullah alamoudi <ba...@gmail.com> wrote:
> 
> Totally agree. Probably better make sure it works nicely with that many
> tasks and then fix the number of readers.
> 
> Cheers,
> Abdullah.
> 
>> On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <dt...@gmail.com> wrote:
>> 
>> Sounds like the load job parallelism needs a redo - it probably shouldn't
>> be more than the number of target partitions IMO...?
>>> On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <ba...@gmail.com> wrote:
>>> 
>>> I have an idea that might explain why such a strange behavior happened. I
>>> believe it could be due to the number of task partitions being very high
>>> assuming each of the 76 files is being read in a separate task.
>>> This could potentially lead to some corner cases that we didn't consider
>>> before considering the number of threads in the tasks thread pool is less
>>> than 76, some tasks will not be able to start until others have completed
>>> execution.
>>> 
>>> Just a thought,
>>> Abdullah.
>>> 
>>> On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <ba...@gmail.com>
>>> wrote:
>>> 
>>>> Yiran,
>>>> Here is one problem causing a failure:
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> 
>>>> which tells us that Input stream given to BTree bulk load has
>> duplicates.
>>>> The question is why this was not returned as the error message? We need
>>> to
>>>> look into that.
>>>> 
>>>> I will continue looking at the log file to see if there were other
>>> issues.
>>>> 
>>>> Can you share with us the load statement you're using? I would like to
>>> see
>>>> how you're loading all the files. we might be able to suggest a way to
>>> make
>>>> it work better.
>>>> 
>>>> Cheers,
>>>> Abdullah.
>>>> 
>>>>> On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <wy...@gmail.com> wrote:
>>>>> 
>>>>> Abdullah,
>>>>> 
>>>>> Here is the log attached. Thank you all very much for looking into
>> this.
>>>>> 
>>>>> Ian - I have two query questions besides this loading issue. I was
>>>>> wondering if I can meet briefly with you (or over email) regarding
>> that.
>>>>> 
>>>>> Thanks!
>>>>> Yiran

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by abdullah alamoudi <ba...@gmail.com>.
Totally agree. Probably better to make sure it works nicely with that many
tasks and then fix the number of readers.

Cheers,
Abdullah.

On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <dt...@gmail.com> wrote:

> Sounds like the load job parallelism needs a redo - it probably shouldn't
> be more than the number of target partitions IMO...?

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Mike Carey <dt...@gmail.com>.
Sounds like the load job parallelism needs a redo - it probably shouldn't
be more than the number of target partitions IMO...?
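
The cap suggested above can be sketched in a few lines. This is only an
illustration in Python, not AsterixDB or Hyracks code: the function name, the
file names, and the partition count of 8 are all assumptions made for the
example. The idea is to group the input files so that the number of readers
never exceeds the number of target partitions.

```python
from typing import List, Sequence


def assign_files_to_readers(files: Sequence[str], num_partitions: int) -> List[List[str]]:
    """Spread input files round-robin over at most `num_partitions` readers.

    Hypothetical helper for illustration only; it is not part of AsterixDB.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Never create more readers than there are files or target partitions.
    num_readers = min(num_partitions, len(files))
    groups: List[List[str]] = [[] for _ in range(num_readers)]
    for i, path in enumerate(files):
        groups[i % num_readers].append(path)
    return groups


if __name__ == "__main__":
    # 76 made-up file names and an assumed 8 target partitions.
    files = [f"part-{i:02d}.adm" for i in range(76)]
    groups = assign_files_to_readers(files, num_partitions=8)
    print(len(groups), [len(g) for g in groups])
```

With this grouping each reader handles several files in turn, so the job's
parallelism follows the partition count rather than the file count.
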
On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <ba...@gmail.com> wrote:

> I have an idea that might explain why such a strange behavior happened. I
> believe it could be due to the number of task partitions being very high
> assuming each of the 76 files is being read in a separate task.
> This could potentially lead to some corner cases that we didn't consider
> before considering the number of threads in the tasks thread pool is less
> than 76, some tasks will not be able to start until others have completed
> execution.
>
> Just a thought,
> Abdullah.

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by abdullah alamoudi <ba...@gmail.com>.
I have an idea that might explain why such strange behavior happened. I
believe it could be due to the number of task partitions being very high,
assuming each of the 76 files is being read in a separate task.
This could potentially lead to some corner cases that we didn't consider
before: if the number of threads in the task thread pool is less than 76,
some tasks will not be able to start until others have completed execution.
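
To make the scenario concrete, here is a toy illustration in Python (not
Hyracks code; the pool size of 8 and the sleep standing in for a file read are
assumptions). It submits 76 tasks to a bounded thread pool and records how many
ever run at the same time. The peak never exceeds the pool size, so the
remaining tasks sit in the queue until a worker frees up.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_tasks: int, pool_size: int) -> int:
    """Run `num_tasks` trivial tasks on a bounded pool and return the
    largest number of tasks that were running at the same moment."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for reading one input file
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(task) for _ in range(num_tasks)]
        for future in futures:
            future.result()  # re-raises any exception from the task
    return peak


if __name__ == "__main__":
    # 76 tasks (one per file) on an assumed pool of 8 threads.
    print(peak_concurrency(num_tasks=76, pool_size=8))
```

If a running task ever had to wait for output from a task that is still
queued, the job could stall, which is the kind of corner case suspected here.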

Just a thought,
Abdullah.

On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <ba...@gmail.com>
wrote:

> Yiran,
> Here is one problem causing a failure:
> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> Input stream given to BTree bulk load has duplicates.
>
> which tells us that Input stream given to BTree bulk load has duplicates.
> The question is why this was not returned as the error message? We need to
> look into that.
>
> I will continue looking at the log file to see if there were other issues.
>
> Can you share with us the load statement you're using? I would like to see
> how you're loading all the files. we might be able to suggest a way to make
> it work better.
>
> Cheers,
> Abdullah.

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by abdullah alamoudi <ba...@gmail.com>.
Yiran,
Here is one problem causing a failure:
edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
Input stream given to BTree bulk load has duplicates.

which tells us that the input stream given to the BTree bulk load has
duplicates. The question is why this was not returned as the error message.
We need to look into that.
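
One way to check the input for duplicate primary keys before loading is
sketched below. This is a minimal Python sketch, not an AsterixDB tool: the
record layout and the "id" key field are assumptions, since the actual data
format and primary key are not shown in this thread. The keys must be
hashable.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable


def find_duplicate_keys(
    records: Iterable[dict],
    key_of: Callable[[dict], Hashable],
) -> Dict[Hashable, int]:
    """Return every key that occurs more than once, with its count."""
    counts = Counter(key_of(record) for record in records)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up records; "id" stands in for whatever the primary key is.
    sample = [{"id": 1}, {"id": 2}, {"id": 1}]
    print(find_duplicate_keys(sample, key_of=lambda r: r["id"]))
```

This keeps every distinct key in memory, which may be heavy for tens of
millions of rows. Sorting the keys externally and comparing neighbours would
be the lower-memory alternative.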

I will continue looking at the log file to see if there were other issues.

Can you share with us the load statement you're using? I would like to see
how you're loading all the files. We might be able to suggest a way to make
it work better.

Cheers,
Abdullah.

On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <wy...@gmail.com> wrote:

> Abdullah,
>
> Here is the log attached. Thank you all very much for looking into this.
>
> Ian - I have two query questions besides this loading issue. I was
> wondering if I can meet briefly with you (or over email) regarding that.
>
> Thanks!
> Yiran

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Yiran Wang <wy...@gmail.com>.
Abdullah,

The log is attached. Thank you all very much for looking into this.

Ian - I have two query questions besides this loading issue. I was
wondering if I could meet briefly with you (or follow up over email) about them.

Thanks!
Yiran

On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dt...@gmail.com> wrote:

> Maybe Ian can visit the cluster with Yiran later today?
> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com> wrote:
>
>> Yiran,
>> Can you share the logs? It would help us identifying the actual cause of
>> this failure much faster.
>>
>> I am pretty sure you know this but in case you didn't, you can get the
>> logs using
>> >managix log -n <instance-name>
>>
>> Also, it would be nice if someone from the team has access to the cluster
>> so we can work with it directly.
>> Cheers,
>> Abdullah.
>>
>>
>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com> wrote:
>>
>>> Steven,
>>>
>>> Thanks for getting back to me so quickly! I wasn't clear. Here is what
>>> happened:
>>>
>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
>>> created a new one, and tried to load the entire 76 files into the newly
>>> created (hence empty) dataset.
>>>
>>> It took about 2mins after executing the query for the error message to
>>> show up. There are currently 31710406 rows of data in the dataset, despite
>>> the error message (so it looks like it did load).
>>>
>>> So my questions are: 1) why did I still get that error message when I
>>> was loading to an empty dataset; and 2) I'm not sure if all the data from
>>> the 76 file are fully loaded. Is there other ways to check, besides trying
>>> to load it again and hope this time I don't get the error?
>>>
>>> Thanks!
>>> Yiran
>>>
>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu>
>>> wrote:
>>>
>>>> Hi,
>>>> Welcome! We are an Apache incubator project now so I added the correct
>>>> mailing list. Our "load" statement only works on an empty dataset.
>>>> Subsequent data needs to be added with an insert or a feed. You should be
>>>> able to load all 76 files at once though (starting from empty).
>>>> Steven
>>>>
>>>>
>>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com> wrote:
>>>>
>>>>> Hi Asterix team!
>>>>>
>>>>> I've come across this error when I was trying to load 76 files into a
>>>>> dataset. When I test-loaded the first 32 files, there wasn't such an error.
>>>>> All 76 files are of the same data format.
>>>>>
>>>>> Can you help interpret what this error message means?
>>>>>
>>>>> Thanks!
>>>>> Yiran
>>>>>
>>>>> --
>>>>> Best,
>>>>> Yiran
>>>>>
>>>
>>>
>>>
>>> --
>>> Best,
>>> Yiran
>>>



-- 
Best,
Yiran

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by Mike Carey <dt...@gmail.com>.
Maybe Ian can visit the cluster with Yiran later today?
On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <ba...@gmail.com> wrote:

> Yiran,
> Can you share the logs? It would help us identify the actual cause of
> this failure much faster.
>
> I am pretty sure you know this, but in case you don't, you can get the
> logs using
> >managix log -n <instance-name>
>
> Also, it would be nice if someone from the team had access to the cluster
> so we could work with it directly.
> Cheers,
> Abdullah.
>
>
> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com> wrote:
>
>> Steven,
>>
>> Thanks for getting back to me so quickly! I wasn't clear. Here is what
>> happened:
>>
>> I test-loaded the first 32 files, no problem. I deleted the dataset,
>> created a new one, and tried to load all 76 files into the newly
>> created (hence empty) dataset.
>>
>> It took about 2 minutes after executing the query for the error message to
>> show up. There are currently 31710406 rows of data in the dataset, despite
>> the error message (so it looks like it did load).
>>
>> So my questions are: 1) why did I still get that error message when I was
>> loading into an empty dataset; and 2) I'm not sure whether all the data from
>> the 76 files was fully loaded. Are there other ways to check, besides trying
>> to load it again and hoping that this time I don't get the error?
>>
>> Thanks!
>> Yiran
>>
>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu> wrote:
>>
>>> Hi,
>>> Welcome! We are an Apache incubator project now so I added the correct
>>> mailing list. Our "load" statement only works on an empty dataset.
>>> Subsequent data needs to be added with an insert or a feed. You should be
>>> able to load all 76 files at once though (starting from empty).
>>> Steven
>>>
>>>
>>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com> wrote:
>>>
>>>> Hi Asterix team!
>>>>
>>>> I've come across this error when I was trying to load 76 files into a
>>>> dataset. When I test-loaded the first 32 files, there wasn't such an error.
>>>> All 76 files are of the same data format.
>>>>
>>>> Can you help interpret what this error message means?
>>>>
>>>> Thanks!
>>>> Yiran
>>>>
>>>> --
>>>> Best,
>>>> Yiran
>>>>
>>
>>
>>
>> --
>> Best,
>> Yiran
>>

Re: Cannot load an index that is not empty [TreeIndexException]

Posted by abdullah alamoudi <ba...@gmail.com>.
Yiran,
Can you share the logs? It would help us identify the actual cause of
this failure much faster.

I am pretty sure you know this, but in case you don't, you can get the logs
using
>managix log -n <instance-name>

Also, it would be nice if someone from the team had access to the cluster
so we could work with it directly.
Cheers,
Abdullah.


On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wy...@gmail.com> wrote:

> Steven,
>
> Thanks for getting back to me so quickly! I wasn't clear. Here is what
> happened:
>
> I test-loaded the first 32 files, no problem. I deleted the dataset,
> created a new one, and tried to load all 76 files into the newly
> created (hence empty) dataset.
>
> It took about 2 minutes after executing the query for the error message to
> show up. There are currently 31710406 rows of data in the dataset, despite
> the error message (so it looks like it did load).
>
> So my questions are: 1) why did I still get that error message when I was
> loading into an empty dataset; and 2) I'm not sure whether all the data from
> the 76 files was fully loaded. Are there other ways to check, besides trying
> to load it again and hoping that this time I don't get the error?
>
> Thanks!
> Yiran
>
> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sj...@ucr.edu> wrote:
>
>> Hi,
>> Welcome! We are an Apache incubator project now so I added the correct
>> mailing list. Our "load" statement only works on an empty dataset.
>> Subsequent data needs to be added with an insert or a feed. You should be
>> able to load all 76 files at once though (starting from empty).
>> Steven
>>
>>
>> On Thursday, February 18, 2016, Yiran Wang <wy...@gmail.com> wrote:
>>
>>> Hi Asterix team!
>>>
>>> I've come across this error when I was trying to load 76 files into a
>>> dataset. When I test-loaded the first 32 files, there wasn't such an error.
>>> All 76 files are of the same data format.
>>>
>>> Can you help interpret what this error message means?
>>>
>>> Thanks!
>>> Yiran
>>>
>>> --
>>> Best,
>>> Yiran
>>>
>
>
>
> --
> Best,
> Yiran
>