Posted to dev@asterixdb.apache.org by Mike Carey <dt...@gmail.com> on 2016/09/14 04:28:12 UTC

Re: Creating RTree: no space left

I can't remember (slight jetlag? :-)) if I shared back to this list one 
theory that came up in India when Wail and I talked F2F - his data has a 
lot of duplicate points, so maybe something goes awry in that case.  I 
wonder if we've sufficiently tested that case?  (E.g., what if there are 
gazillions of records originating from a small handful of points?)
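
One way to exercise the case Mike describes (very many records drawn from a small handful of points) is to generate a synthetic dataset with that shape and try an R-tree build on it. The sketch below is only an illustration: the function name `duplicate_heavy_points`, the column names `id`, `x`, `y`, and the CSV layout are assumptions made here, not an AsterixDB load format or anything taken from the thread.

```python
import csv
import random
import sys


def duplicate_heavy_points(num_records, num_points, seed=0):
    """Yield (record_id, x, y) rows drawn from a small, fixed set of points.

    Hypothetical helper: every record reuses one of `num_points` locations,
    so the data contains many exact duplicates of each point.
    """
    rng = random.Random(seed)
    # The distinct locations: a small handful compared to num_records.
    points = [(rng.uniform(-180.0, 180.0), rng.uniform(-90.0, 90.0))
              for _ in range(num_points)]
    for record_id in range(num_records):
        x, y = rng.choice(points)
        yield record_id, x, y


if __name__ == "__main__":
    # Small demo: 20 records spread over only 3 distinct points.
    writer = csv.writer(sys.stdout)
    writer.writerow(["id", "x", "y"])
    for row in duplicate_heavy_points(num_records=20, num_points=3):
        writer.writerow(row)
```

Scaling `num_records` up while keeping `num_points` small would approximate the "gazillions of records, handful of points" shape; the rows would still need to be converted to whatever format the target dataset expects.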

On 8/26/16 9:55 AM, Taewoo Kim wrote:
> Based on a rough calculation, per partition, each point field takes 3.6GB
> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we are
> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
> that there was no issue when creating a B+ tree index, we need to check
> what SORT process is required by R-Tree index.
>
> Best,
> Taewoo
>
> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <ji...@gmail.com>
> wrote:
>
>> If all of the file names start with “ExternalSortRunGenerator”, then they
>> are the first round files which can not be GCed.
>> Could you provide the query plan as well?
>>
>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wa...@gmail.com> wrote:
>>>
>>> Hi Ian and Pouria,
>>>
>>> The names of the files along with their sizes (there were 625 of those
>>> before crashing):
>>>
>>> size        name
>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
>>>
>>> no files were generated beyond runs.
>>> compiler.sortmemory = 64MB
>>>
>>> Here are the full logs:
>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>
>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
>>> wrote:
>>>
>>>> We previously had issues with huge spilled sort temp files when creating
>>>> an inverted index for fuzzy queries, but NOT R-Trees.
>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
>>>> intermediate temp files until the end of the query execution.
>>>> If you can share the names of a couple of temp files (and their sizes,
>>>> along with the sort memory setting you have in asterix-configuration.xml),
>>>> we may be able to make a better guess as to whether the sort is really
>>>> going into a two-level merge or not.
>>>>
>>>> Pouria
>>>>
>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <im...@uci.edu> wrote:
>>>>
>>>>> I think that exception ("No space left on device") is just cast from the
>>>>> native IOException. Therefore I would be inclined to believe it's
>>>>> genuinely out of space. I suppose the question is why the external sort
>>>>> is so huge.
>>>>> What is the query plan? Maybe that will shed light on a possible cause.
>>>>>
>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
>>>>>>
>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Chris and Mike,
>>>>>>>
>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>
>>>>>>>    - The size of each partition is about 40GB (80GB in total per
>>>>>>>    iodevice).
>>>>>>>    - The runs took 157GB per iodevice (about 2x of the dataset size).
>>>>>>>    Each run takes either of 128MB or 96MB of storage.
>>>>>>>    - At a certain time, there were 522 runs.
>>>>>>>
>>>>>>> I even tried to create a BTree index to see if that happens as well. I
>>>>>>> created two BTree indexes, one for the *location* and one for the
>>>>>>> *caller*, and they were created successfully. The sizes of the runs
>>>>>>> didn't come anywhere near that.
>>>>>>>
>>>>>>> Logs are attached.
>>>>>>>
>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dt...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I think we might have "file GC issues" - I vaguely remember that we don't
>>>>>>>> (or at least didn't once upon a time) proactively remove unnecessary run
>>>>>>>> files - removing all of them at end-of-job instead of at the end of the
>>>>>>>> execution phase that uses their contents.  We may also have an "Amdahl
>>>>>>>> problem" right now with our sort since we serialize phase two of parallel
>>>>>>>> sorts - though this is not a query, it's index build, so that shouldn't be
>>>>>>>> it.  It would be interesting to put a df/sleep script on each of the nodes
>>>>>>>> when this is happening - actually a script that monitors the temp file
>>>>>>>> directory - and watch the lifecycle happen and the sizes change....
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>
>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the device -
>>>>>>>>> possibly you've run out of inodes even if the space isn't all used up.
>>>>>>>>> It's unlikely because I don't think AsterixDB creates a bunch of small
>>>>>>>>> files, but worth checking.
>>>>>>>>>
>>>>>>>>> If that's not it, then can you share the full exception and stack trace?
>>>>>>>>>
>>>>>>>>> Ceej
>>>>>>>>> aka Chris Hillery
>>>>>>>>>
>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still get the
>>>>>>>>>> same issue.
>>>>>>>>>>
>>>>>>>>>> The data contains:
>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>> 2- Schema:
>>>>>>>>>>
>>>>>>>>>> create type CDRType as {
>>>>>>>>>>     id:uuid,
>>>>>>>>>>     'date':string,
>>>>>>>>>>     'time':string,
>>>>>>>>>>     'duration':int64,
>>>>>>>>>>     'caller':int64,
>>>>>>>>>>     'callee':int64,
>>>>>>>>>>     location:point?
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dears,
>>>>>>>>>>>
>>>>>>>>>>> I have a dataset of size 290GB loaded on 3 NCs, each of which has
>>>>>>>>>>> 2x500GB SSDs.
>>>>>>>>>>>
>>>>>>>>>>> Each NC has two IODevices (partitions) on each hard drive (i.e., the
>>>>>>>>>>> total is 4 iodevices per NC). After loading the data, each Asterix
>>>>>>>>>>> partition occupied 31GB.
>>>>>>>>>>>
>>>>>>>>>>> The cluster has about 50% free space on each hard drive
>>>>>>>>>>> (approximately 250GB of free space on each hard drive). However, when
>>>>>>>>>>> I tried to create an index of type RTree, I got an exception saying
>>>>>>>>>>> that no space was left on the hard drive during the External Sort
>>>>>>>>>>> phase.
>>>>>>>>>>>
>>>>>>>>>>> Is that normal?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> *Regards,*
>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> *Regards,*
>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> *Regards,*
>>>>>>> Wail Alkowaileet
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> *Regards,*
>>>>>> Wail Alkowaileet
>>>>>>
>>>
>>>
>>> --
>>>
>>> *Regards,*
>>> Wail Alkowaileet
>>
>>
>> Best,
>>
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>>
>>
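
Taewoo's rough calculation quoted above can be checked in a few lines. This is a sketch under stated assumptions: the 16 bytes per point, the record count, and the 12 partitions come from the thread, while reading "GB" and "MB" as binary units (GiB, MiB) is an assumption made here, and all variable names are illustrative.

```python
POINT_BYTES = 16           # bytes per point field, as used in the estimate
RECORDS = 2_887_453_794    # record count reported for the dataset
PARTITIONS = 12            # 3 NCs x 4 iodevices
RUN_FILES = 625            # run files observed before the crash
GIB = 1024 ** 3            # assumption: "GB" in the thread means GiB
MIB = 1024 ** 2            # assumption: "MB" in the thread means MiB

# Raw size of the point field that each partition has to sort.
point_bytes_per_partition = POINT_BYTES * RECORDS / PARTITIONS
print(f"point data per partition: {point_bytes_per_partition / GIB:.2f} GiB")

# Total size implied by 625 run files of 96 MB to 128 MB each.
low = RUN_FILES * 96 * MIB
high = RUN_FILES * 128 * MIB
print(f"{RUN_FILES} run files: {low / GIB:.1f} to {high / GIB:.1f} GiB")
```

Under these assumptions the point data comes to about 3.59 GiB per partition, matching the 3.6GB in the estimate, while 625 run files account for roughly 58.6 to 78.1 GiB. The 157GB reported per iodevice is about twice that upper bound, which would fit two partitions sharing an iodevice, although the thread does not confirm how the 625 files were counted. Either way, the spilled runs are far larger than the raw point data being sorted.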

Re: Creating RTree: no space left

Posted by Ahmed Eldawy <el...@cs.ucr.edu>.

If you're interested in testing a big dataset, you can try this OSM dataset,
which comes in a simple CSV format.
https://drive.google.com/open?id=0B1jY75xGiy7eNjJuRy1KWjRieVU
It contains 2.7 billion records with very few duplicates.
This is what the dataset looks like:
http://spatialhadoop.cs.umn.edu/datasets/osm2/all_nodes.pyramid/

Thanks
Ahmed

On Thu, Sep 15, 2016 at 10:13 PM Khurram Faraaz <kh...@gmail.com>
wrote:

> @Pouria here is Uber trip data
>
> https://github.com/fivethirtyeight/uber-tlc-foil-response
>
> On Sep 16, 2016 1:21 AM, "Chen Li" <ch...@gmail.com> wrote:
>
> > @Wail: as a use case related to selectivity, our current Cloudberry
> > prototype doesn't benefit from R-tree when the user is analyzing the data
> > for the entire US.  But we expect to have R-tree benefits when a user
> > zooms into a small region.
> >
> > On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet <wa...@gmail.com>
> > wrote:
> >
> > > Hi Ahmed and Mike,
> > >
> > > @Ahmed
> > > I actually did a small experiment where I loaded about 1/5 of the data
> > > (so I can index it), and it seems that the R-Tree was really useful for
> > > querying small regions or neighborhoods.
> > > I also tried the B-Tree and it was slower than a full scan.
> > >
> > > @Mike
> > > Unfortunately, I still cannot, even after anonymization :-)
> > >
> > >
> > > On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dt...@gmail.com> wrote:
> > >
> > > > Interesting point, so to speak.  @Wail, any chance you could post a
> > > > Google maps screenshot showing a visualization of the points in this
> > > > dataset on the underlying geographic region?  (If the dataset is
> > > > shareable in that anonymized form?)  I would think an R-tree would still
> > > > be good for small-region geo queries - possibly shrinking the candidate
> > > > object set by a factor of 10,000 - so still useful - and we also do
> > > > index-AND-ing now, so we would also combine that shrinkage with other
> > > > index-provided shrinkage on any other index-amenable predicates.  I
> > > > think the queries are still spatial in nature, and the only AsterixDB
> > > > choice for that is R-tree.  (We did experiments with things like Hilbert
> > > > B-trees, but the results led to the conclusion that the code base only
> > > > needs R-trees for spatial data for the foreseeable future - they just
> > > > work too well and in a no-tuning-required fashion.... :-))
> > > >
> > > >
> > > >
> > > > On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> > > >
> > > >> Looks like an interesting case. Just a small question. Are you sure a
> > > >> spatial index is the right one to use here? The spatial attribute looks
> > > >> more like a categorization and a hash or B-tree index could be more
> > > >> suitable. As far as I know, the spatial index in AsterixDB is a
> > > >> secondary R-tree index which, like any other secondary index, is only
> > > >> good for retrieving a small number of records. For this dataset, it
> > > >> seems that any small range would still return a huge number of records.
> > > >>
> > > >> It is still interesting to further investigate and fix the sort issue,
> > > >> but I mentioned the usage issue for a different perspective.
> > > >>
> > > >> Thanks
> > > >> Ahmed
> > > >>
> > > >> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dt...@gmail.com> wrote:
> > > >>
> > > >>> ☺!
> > > >>>
> > > >>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com> wrote:
> > > >>>
> > > >>>> To be exact
> > > >>>> I have 2,255,091,590 records and 10,391 points :-)
> > > >>>>
> > > >>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dt...@gmail.com> wrote:
> > > >>>>
> > > >>>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
> > > >>>>> remember having done it for sure.  Oops! Scattered from VLDB, I
> > > >>>>> guess...!
> > > >>>>>
> > > >>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
> > > >>>>>
> > > >>>>>> @Mike: You filed an issue -
> > > >>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Taewoo
> > > >>>>>>
> > > >>>> --
> > > >>>>
> > > >>>> *Regards,*
> > > >>>> Wail Alkowaileet
> > > >>>>
> > > >>>>
> > > >
> > >
> > >
> > > --
> > >
> > > *Regards,*
> > > Wail Alkowaileet
> > >
> >
>
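
Mike's earlier suggestion in this thread (a df/sleep script that watches the temp file directory) and Chris's "df -i" inode check can be combined into one small monitor. The following is a minimal sketch, not a tool from AsterixDB: the function names `snapshot` and `watch` are made up here, the temp directory path has to be supplied by whoever runs it, and only the `ExternalSortRunGenerator` file-name prefix is taken from the thread. It relies on `os.statvfs`, so it assumes a POSIX system.

```python
import os
import shutil
import time


def snapshot(temp_dir, prefix="ExternalSortRunGenerator"):
    """Return free bytes, free inodes, and the count and total size of run files."""
    usage = shutil.disk_usage(temp_dir)   # named tuple: total, used, free (bytes)
    vfs = os.statvfs(temp_dir)            # POSIX only; f_ffree is the free inode count
    count = 0
    total = 0
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            if not entry.name.startswith(prefix):
                continue
            try:
                if entry.is_file():
                    total += entry.stat().st_size
                    count += 1
            except FileNotFoundError:
                # The run file was deleted between listing and stat; skip it.
                continue
    return {"free_bytes": usage.free, "free_inodes": vfs.f_ffree,
            "run_files": count, "run_bytes": total}


def watch(temp_dir, interval_seconds=10.0, iterations=None):
    """Print one snapshot per interval; loops until interrupted if iterations is None."""
    done = 0
    while iterations is None or done < iterations:
        print(time.strftime("%H:%M:%S"), snapshot(temp_dir))
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_seconds)
```

Calling `watch("/path/to/iodevice/temp")` on each node during the index build (with the path replaced by the real temp directory) would show whether the number and total size of run files keep growing until the device fills, or whether files are being cleaned up along the way.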

Re: Creating RTree: no space left

Posted by Khurram Faraaz <kh...@gmail.com>.

@Pouria here is Uber trip data

https://github.com/fivethirtyeight/uber-tlc-foil-response

On Sep 16, 2016 1:21 AM, "Chen Li" <ch...@gmail.com> wrote:

> @Wail: as a use case related to selectivity, our current Cloudberry
> prototype doesn't benefit from R-tree when the user is analyzing the data
> for the entire US.  But we expect to have R-tree benefits when a user zooms
> into a small region.
>
> On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet <wa...@gmail.com>
> wrote:
>
> > Hi Ahmed and Mike,
> >
> > @Ahmed
> > I actually did a small experiment where I loaded about 1/5 of the data
> (so
> > I can index it) and seems that the R-Tree was really useful for querying
> > small regions or neighborhoods.
> > I also tried the B-Tree and it was slower than a full scan.
> >
> > @Mike
> > Unfortunately, I cannot still even after anonymization :-)
> >
> >
> > On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dt...@gmail.com> wrote:
> >
> > > Interesting point, so to speak.  @Wail, any chance you could post a
> > Google
> > > maps screenshot showing a visualization of the points in this dataset
> on
> > > the underlying geographic region?  (If the dataset is shareable in that
> > > anonymized form?)  I would think an R-tree would still be good for
> > > small-region geo queries - possibly shrinking the candidate object set
> > by a
> > > factor of 10,000 - so still useful - and we also do index-AND-ing now,
> so
> > > we would also combine that shrinkage by other index-provided shrinkage
> on
> > > any other index-amenable predicates.  I think the queries are still
> > spatial
> > > in nature, and the only AsterixDB choices for that are R-tree.  (We did
> > > experiments with things like Hilbert B-trees, but the results led to
> the
> > > conclusion that the code base only needs R-trees for spatial data for
> the
> > > forseeable future - they just work too well and in a no-tuning-required
> > > fashion.... :-))
> > >
> > >
> > >
> > > On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> > >
> > >> Looks like an interesting case. Just a small question. Are you sure a
> > >> spatial index is the right one to use here? The spatial attribute
> looks
> > >> more like a categorization and a hash or B-tree index could be more
> > >> suitable. As far as I know, the spatial index in AsterixDB is a
> > secondary
> > >> R-tree index which, like any other secondary index, is only good for
> > >> retrieving a small number of records. For this dataset, it seems that
> > any
> > >> small range would still return a huge number of records.
> > >>
> > >> It is still interesting to further investigate and fix the sort issue
> > but
> > >> I
> > >> mentioned the usage issue for a different perspective.
> > >>
> > >> Thanks
> > >> Ahmed
> > >>
> > >> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dt...@gmail.com>
> wrote:
> > >>
> > >> ☺!
> > >>>
> > >>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com>
> > wrote:
> > >>>
> > >>> To be exact
> > >>>> I have 2,255,091,590 records and 10,391 points :-)
> > >>>>
> > >>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dt...@gmail.com>
> > wrote:
> > >>>>
> > >>>> Thx!  I knew I'd meant to "activate" the thought somehow, but
> couldn't
> > >>>>> remember having done it for sure.  Oops! Scattered from VLDB, I
> > >>>>>
> > >>>> guess...!
> > >>>
> > >>>>
> > >>>>>
> > >>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
> > >>>>>
> > >>>>> @Mike: You filed an issue -
> > >>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Taewoo
> > >>>>>>
> > >>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dt...@gmail.com>
> > >>>>>>
> > >>>>> wrote:
> > >>>
> > >>>> I can't remember (slight jetlag? :-)) if I shared back to this list
> > >>>>>>
> > >>>>> one
> > >>>
> > >>>> theory that came up in India when Wail and I talked F2F - his data
> > >>>>>>>
> > >>>>>> has
> > >>>
> > >>>> a
> > >>>>
> > >>>>> lot of duplicate points, so maybe something goes awry in that case.
> > >>>>>>>
> > >>>>>> I
> > >>>
> > >>>> wonder if we've sufficiently tested that case?  (E.g., what if there
> > >>>>>>>
> > >>>>>> are
> > >>>>
> > >>>>> gazillions of records originating from a small handful of points?)
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
> > >>>>>>>
> > >>>>>>> Based on a rough calculation, per partition, each point field
> takes
> > >>>>>>>
> > >>>>>> 3.6GB
> > >>>>
> > >>>>> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we
> > >>>>>>>>
> > >>>>>>> are
> > >>>
> > >>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail
> > >>>>>>>>
> > >>>>>>> mentioned
> > >>>>
> > >>>>> that there was no issue when creating a B+ tree index, we need to
> > >>>>>>>>
> > >>>>>>> check
> > >>>>
> > >>>>> what SORT process is required by R-Tree index.
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Taewoo
> > >>>>>>>>
> > >>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <
> > >>>>>>>>
> > >>>>>>> jianfeng.jia@gmail.com
> > >>>
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>> If all of the file names start with “ExternalSortRunGenerator”,
> > then
> > >>>>>>>> they
> > >>>>>>>>
> > >>>>>>>> are the first round files which can not be GCed.
> > >>>>>>>>> Could you provide the query plan as well?
> > >>>>>>>>>
> > >>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <
> > wael.y.k@gmail.com
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hi Ian and Pouria,
> > >>>>>>>>>
> > >>>>>>>>>> The name of the files along with the sizes (there were 625 one
> > of
> > >>>>>>>>>> those
> > >>>>>>>>>> before crashing):
> > >>>>>>>>>>
> > >>>>>>>>>> size        name
> > >>>>>>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
> > >>>>>>>>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
> > >>>>>>>>>>
> > >>>>>>>>>> no files were generated beyond runs.
> > >>>>>>>>>> compiler.sortmemory = 64MB
> > >>>>>>>>>>
> > >>>>>>>>>> Here is the full logs
> > >>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_
> > >>>>>>>>>>
> > >>>>>>>>>> 25_07%3A34%3A52_AST_2016.zip?dl=0>
> > >>>>>>>>>>
> > >>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <
> > >>>>>>>>>
> > >>>>>>>>>> pouria.pirzadeh@gmail.com>
> > >>>>>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> We previously had issues with huge spilled sort temp files
> when
> > >>>>>>>>>> creating
> > >>>>>>>>>>
> > >>>>>>>>>> inverted index for fuzzy queries, but NOT R-Trees.
> > >>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying
> clean-up
> > >>>>>>>>>>>
> > >>>>>>>>>> for
> > >>>
> > >>>> intermediate temp files until the end of the query execution.
> > >>>>>>>>>>> If you can share names of a couple of temp files (and their
> > sizes
> > >>>>>>>>>>> along
> > >>>>>>>>>>> with the sort memory setting you have in
> > >>>>>>>>>>>
> > >>>>>>>>>> asterix-configuration.xml)
> > >>>
> > >>>> we
> > >>>>>>>>>>>
> > >>>>>>>>>>> may
> > >>>>>>>>>>>
> > >>>>>>>>>> be able to have a better guess as if the sort is really going
> > >>>>>>>>>>
> > >>>>>>>>> into a
> > >>>
> > >>>> two-level merge or not.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Pouria
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <im...@uci.edu>
> > >>>>>>>>>>>
> > >>>>>>>>>> wrote:
> > >>>>
> > >>>>> I think that exception ("No space left on device") is just cast
> > >>>>>>>>>>> from
> > >>>>>>>>>>> the
> > >>>>>>>>>>>
> > >>>>>>>>>>> native IOException. Therefore I would be inclined to believe
> > it's
> > >>>>>>>>>>>
> > >>>>>>>>>>>> genuinely
> > >>>>>>>>>>>>
> > >>>>>>>>>>> out of space. I suppose the question is why the external sort
> > is
> > >>>>>>>>>>>
> > >>>>>>>>>> so
> > >>>
> > >>>> huge.
> > >>>>>>>>>>>>
> > >>>>>>>>>>> What is the query plan? Maybe that will shed light on a
> > possible
> > >>>>>>>>>> cause.
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <
> > >>>>>>>>>>>
> > >>>>>>>>>>>> wael.y.k@gmail.com
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <
> > >>>>>>>>>>>>> wael.y.k@gmail.com
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi Chris and Mike,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>      - The size of each partition is about 40GB (80GB in
> > total
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> per
> > >>>>
> > >>>>>      iodevice).
> > >>>>>>>>>>>>>>      - The runs took 157GB per iodevice (about 2x of the
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> dataset
> > >>>
> > >>>> size).
> > >>>>>>>>>>>>>>      Each run takes either of 128MB or 96MB of storage.
> > >>>>>>>>>>>>>>      - At a certain time, there were 522 runs.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I even tried to create a BTree Index to see if that
> happens
> > as
> > >>>>>>>>>>>>>> well.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> created two BTree indexes one for the *location* and one
> for
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> *caller
> > >>>>>>>>>>>>> *and
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> they were created successfully. The sizes of the runs
> didn't
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> take
> > >>>
> > >>>> anywhere
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> near that.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Logs are attached.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> dtabass@gmail.com>
> > >>>
> > >>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember
> > >>>>>>>>>>>> that
> > >>>>>>>>>>>>
> > >>>>>>>>>>> we
> > >>>>
> > >>>>> don't
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> (or at least didn't once upon a time) proactively remove
> > >>>>>>>>>>>>>> unnecessary
> > >>>>>>>>>>>>>> run
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> files - removing all of them at end-of-job instead of at
> the
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> end
> > >>>
> > >>>> of
> > >>>>
> > >>>>> the
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> execution phase that uses their contents.  We may also
> have
> > an
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> "Amdahl
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> problem" right now with our sort since we serialize phase
> > two
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> of
> > >>>
> > >>>> parallel
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> sorts - though this is not a query, it's index build, so
> > that
> > >>>>>>>>>>>>>> shouldn't
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> it.  It would be interesting to put a df/sleep script on
> each
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> of
> > >>>
> > >>>> the
> > >>>>>>>>>>>>>> nodes
> > >>>>>>>>>>>>>> when this is happening - actually a script that monitors
> the
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> temp
> > >>>>
> > >>>>> file
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> change....
> > >>>>
> > >>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i"
> on
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> the
> > >>>
> > >>>> device
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> -
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> possibly you've run out of inodes even if the space isn't
> all
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> used
> > >>>>
> > >>>>> up.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> It's
> > >>>>>>>>>>>>>> unlikely because I don't think AsterixDB creates a bunch
> of
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> small
> > >>>>
> > >>>>> files,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> but worth checking.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> If that's not it, then can you share the full exception
> and
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> stack
> > >>>>
> > >>>>> trace?
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Ceej
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> aka Chris Hillery
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> wael.y.k@gmail.com>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I
> > still
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> get
> > >>>>
> > >>>>> the
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> same
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> issue.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> The data contains:
> > >>>>>>>>>>>>>>>>> 1- 2887453794 records.
> > >>>>>>>>>>>>>>>>> 2- Schema:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> create type CDRType as {
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> id:uuid,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 'date':string,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 'time':string,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 'duration':int64,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 'caller':int64,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 'callee':int64,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> location:point?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> }
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> wael.y.k@gmail.com
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Dears,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I have a dataset of size 290GB loaded in a 3 NCs each of
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> which
> > >>>>
> > >>>>> has
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 2x500GB
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> SSD.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Each of NC has two IODevices (partitions) in each hard
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> drive
> > >>>
> > >>>> (i.e
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> total is 4 iodevices per NC). After loading the data,
> > each
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Asterix
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> partition occupied 31GB.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> (approximately
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> about 250GB free space in each hard drive). However,
> > when I
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> tried
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> create
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> an index of type RTree, I got an exception that no space
> > left
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> in
> > >>>
> > >>>> the
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> hard
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> drive during the External Sort phase.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Is that normal ?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> *Regards,*
> > >>>>>>>>>>>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> *Regards,*
> > >>>>>>>>>>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> *Regards,*
> > >>>>>>>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> *Regards,*
> > >>>>>>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> *Regards,*
> > >>>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>
> > >>>>>>>>> Jianfeng Jia
> > >>>>>>>>> PhD Candidate of Computer Science
> > >>>>>>>>> University of California, Irvine
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>> --
> > >>>>
> > >>>> *Regards,*
> > >>>> Wail Alkowaileet
> > >>>>
> > >>>>
> > >
> >
> >
> > --
> >
> > *Regards,*
> > Wail Alkowaileet
> >
>

Re: Creating RTree: no space left

Posted by Chen Li <ch...@gmail.com>.

@Wail: as a use case related to selectivity, our current Cloudberry
prototype doesn't benefit from R-tree when the user is analyzing the data
for the entire US.  But we expect to have R-tree benefits when a user zooms
into a small region.

On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet <wa...@gmail.com>
wrote:

> Hi Ahmed and Mike,
>
> @Ahmed
> I actually did a small experiment where I loaded about 1/5 of the data (so
> I could index it), and it seems that the R-Tree was really useful for
> querying small regions or neighborhoods.
> I also tried the B-Tree and it was slower than a full scan.
>
> @Mike
> Unfortunately, I still cannot, even after anonymization :-)
>
>

Re: Creating RTree: no space left

Posted by Pouria Pirzadeh <po...@gmail.com>.

@Wail
One quick question:
By any chance, do you have some spatial data at a similar scale
(size/cardinality-wise) but with fewer (ideally no) duplicates? I am
really curious to know whether the core of your loading problem is the
size/settings being used or the duplicates. (We have successfully
loaded data at scale into R-Tree indices before, but with different
settings: the amount of data per NC and per IO device was smaller, and
there were definitely far fewer duplicates, at least in my experiments.)

Thanks.
Pouria
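
For a sense of how skewed this dataset is, the figures quoted further down in this thread can be plugged into a short calculation. This is only a rough sketch: the record counts, the point count, the 16 bytes per point and the 12 partitions are taken from the quoted messages, while the function names and the assumption that records are spread evenly across partitions are illustrative and not anything AsterixDB itself provides.

```python
# Back-of-the-envelope numbers for the duplicate-points question.
# Constants come from messages quoted in this thread; the helper
# functions are hypothetical and assume an even spread of records.

RECORDS_WITH_POINTS = 2_255_091_590   # Wail's "to be exact" record count
DISTINCT_POINTS = 10_391              # Wail's distinct point count
RECORDS_TOTAL = 2_887_453_794         # record count used in Taewoo's estimate
POINT_BYTES = 16                      # bytes per point, per Taewoo's estimate
PARTITIONS = 12                       # 3 NCs x 4 partitions each


def duplication_factor(records: int, distinct_keys: int) -> float:
    """Average number of records sharing a single key value."""
    return records / distinct_keys


def key_bytes_per_partition(records: int, key_bytes: int, partitions: int) -> float:
    """Raw key bytes each partition would sort, assuming an even spread."""
    return records * key_bytes / partitions


def to_gib(n_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    dup = duplication_factor(RECORDS_WITH_POINTS, DISTINCT_POINTS)
    per_part = key_bytes_per_partition(RECORDS_TOTAL, POINT_BYTES, PARTITIONS)
    print(f"average records per distinct point: {dup:,.0f}")
    print(f"point data per partition: {to_gib(per_part):.2f} GiB")
```

Under these assumptions each distinct point is shared by a couple of hundred thousand records on average, and the raw point data per partition is only a few GiB, so neither number by itself accounts for run files totalling 157GB per iodevice.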

On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet <wa...@gmail.com>
wrote:

> Hi Ahmed and Mike,
>
> @Ahmed
> I actually did a small experiment where I loaded about 1/5 of the data (so
> I could index it), and it seems that the R-Tree was really useful for
> querying small regions or neighborhoods.
> I also tried the B-Tree and it was slower than a full scan.
>
> @Mike
> Unfortunately, I still cannot, even after anonymization :-)
>
>
> On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dt...@gmail.com> wrote:
>
> > Interesting point, so to speak.  @Wail, any chance you could post a
> Google
> > maps screenshot showing a visualization of the points in this dataset on
> > the underlying geographic region?  (If the dataset is shareable in that
> > anonymized form?)  I would think an R-tree would still be good for
> > small-region geo queries - possibly shrinking the candidate object set
> by a
> > factor of 10,000 - so still useful - and we also do index-AND-ing now, so
> > we would also combine that shrinkage by other index-provided shrinkage on
> > any other index-amenable predicates.  I think the queries are still
> spatial
> > in nature, and the only AsterixDB choices for that are R-tree.  (We did
> > experiments with things like Hilbert B-trees, but the results led to the
> > conclusion that the code base only needs R-trees for spatial data for the
> > foreseeable future - they just work too well and in a no-tuning-required
> > fashion.... :-))
> >
> >
> >
> > On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> >
> >> Looks like an interesting case. Just a small question. Are you sure a
> >> spatial index is the right one to use here? The spatial attribute looks
> >> more like a categorization and a hash or B-tree index could be more
> >> suitable. As far as I know, the spatial index in AsterixDB is a
> secondary
> >> R-tree index which, like any other secondary index, is only good for
> >> retrieving a small number of records. For this dataset, it seems that
> any
> >> small range would still return a huge number of records.
> >>
> >> It is still interesting to further investigate and fix the sort issue
> but
> >> I
> >> mentioned the usage issue for a different perspective.
> >>
> >> Thanks
> >> Ahmed
> >>
> >> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dt...@gmail.com> wrote:
> >>
> >> ☺!
> >>>
> >>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com>
> wrote:
> >>>
> >>> To be exact
> >>>> I have 2,255,091,590 records and 10,391 points :-)
> >>>>
> >>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dt...@gmail.com>
> wrote:
> >>>>
> >>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
> >>>>> remember having done it for sure.  Oops! Scattered from VLDB, I
> >>>>>
> >>>> guess...!
> >>>
> >>>>
> >>>>>
> >>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
> >>>>>
> >>>>> @Mike: You filed an issue -
> >>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
> >>>>>>
> >>>>>> Best,
> >>>>>> Taewoo
> >>>>>>
> >>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dt...@gmail.com>
> >>>>>>
> >>>>> wrote:
> >>>
> >>>> I can't remember (slight jetlag? :-)) if I shared back to this list
> >>>>>>
> >>>>> one
> >>>
> >>>> theory that came up in India when Wail and I talked F2F - his data
> >>>>>>>
> >>>>>> has
> >>>
> >>>> a
> >>>>
> >>>>> lot of duplicate points, so maybe something goes awry in that case.
> >>>>>>>
> >>>>>> I
> >>>
> >>>> wonder if we've sufficiently tested that case?  (E.g., what if there
> >>>>>>>
> >>>>>> are
> >>>>
> >>>>> gazillions of records originating from a small handful of points?)
> >>>>>>>
> >>>>>>>
> >>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
> >>>>>>>
> >>>>>>> Based on a rough calculation, per partition, each point field takes
> >>>>>>>
> >>>>>> 3.6GB
> >>>>
> >>>>> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we
> >>>>>>>>
> >>>>>>> are
> >>>
> >>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail
> >>>>>>>>
> >>>>>>> mentioned
> >>>>
> >>>>> that there was no issue when creating a B+ tree index, we need to
> >>>>>>>>
> >>>>>>> check
> >>>>
> >>>>> what SORT process is required by R-Tree index.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Taewoo
> >>>>>>>>
> >>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <
> >>>>>>>>
> >>>>>>> jianfeng.jia@gmail.com
> >>>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>> If all of the file names start with “ExternalSortRunGenerator”,
> then
> >>>>>>>> they
> >>>>>>>>
> >>>>>>>> are the first round files which can not be GCed.
> >>>>>>>>> Could you provide the query plan as well?
> >>>>>>>>>
> >>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <
> wael.y.k@gmail.com
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Ian and Pouria,
> >>>>>>>>>
> >>>>>>>>>> The name of the files along with the sizes (there were 625 one
> of
> >>>>>>>>>> those
> >>>>>>>>>> before crashing):
> >>>>>>>>>>
> >>>>>>>>>> size        name
> >>>>>>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
> >>>>>>>>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
> >>>>>>>>>>
> >>>>>>>>>> no files were generated beyond runs.
> >>>>>>>>>> compiler.sortmemory = 64MB
> >>>>>>>>>>
> >>>>>>>>>> Here is the full logs
> >>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_
> >>>>>>>>>>
> >>>>>>>>>> 25_07%3A34%3A52_AST_2016.zip?dl=0>
> >>>>>>>>>>
> >>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <
> >>>>>>>>>
> >>>>>>>>>> pouria.pirzadeh@gmail.com>
> >>>>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> We previously had issues with huge spilled sort temp files when
> >>>>>>>>>> creating
> >>>>>>>>>>
> >>>>>>>>>> inverted index for fuzzy queries, but NOT R-Trees.
> >>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up
> >>>>>>>>>>>
> >>>>>>>>>> for
> >>>
> >>>> intermediate temp files until the end of the query execution.
> >>>>>>>>>>> If you can share names of a couple of temp files (and their
> sizes
> >>>>>>>>>>> along
> >>>>>>>>>>> with the sort memory setting you have in
> >>>>>>>>>>>
> >>>>>>>>>> asterix-configuration.xml)
> >>>
> >>>> we
> >>>>>>>>>>>
> >>>>>>>>>>> may
> >>>>>>>>>>>
> >>>>>>>>>> be able to have a better guess as if the sort is really going
> >>>>>>>>>>
> >>>>>>>>> into a
> >>>
> >>>> two-level merge or not.
> >>>>>>>>>>>
> >>>>>>>>>>> Pouria
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <im...@uci.edu>
> >>>>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>
> >>>>> I think that exception ("No space left on device") is just cast
> >>>>>>>>>>> from
> >>>>>>>>>>> the
> >>>>>>>>>>>
> >>>>>>>>>>> native IOException. Therefore I would be inclined to believe
> it's
> >>>>>>>>>>>
> >>>>>>>>>>>> genuinely
> >>>>>>>>>>>>
> >>>>>>>>>>> out of space. I suppose the question is why the external sort
> is
> >>>>>>>>>>>
> >>>>>>>>>> so
> >>>
> >>>> huge.
> >>>>>>>>>>>>
> >>>>>>>>>>> What is the query plan? Maybe that will shed light on a
> possible
> >>>>>>>>>> cause.
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <
> >>>>>>>>>>>
> >>>>>>>>>>>> wael.y.k@gmail.com
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <
> >>>>>>>>>>>>> wael.y.k@gmail.com
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Chris and Mike,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>      - The size of each partition is about 40GB (80GB in
> total
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> per
> >>>>
> >>>>>      iodevice).
> >>>>>>>>>>>>>>      - The runs took 157GB per iodevice (about 2x of the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> dataset
> >>>
> >>>> size).
> >>>>>>>>>>>>>>      Each run takes either of 128MB or 96MB of storage.
> >>>>>>>>>>>>>>      - At a certain time, there were 522 runs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I even tried to create a BTree Index to see if that happens
> as
> >>>>>>>>>>>>>> well.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> created two BTree indexes one for the *location* and one for
> >>>>>>>>>>>> the
> >>>>>>>>>>>>
> >>>>>>>>>>>> *caller
> >>>>>>>>>>>>> *and
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> they were created successfully. The sizes of the runs didn't
> >>>>>>>>>>>>>
> >>>>>>>>>>>> take
> >>>
> >>>> anyway
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> near that.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Logs are attached.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> dtabass@gmail.com>
> >>>
> >>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember
> >>>>>>>>>>>> that
> >>>>>>>>>>>>
> >>>>>>>>>>> we
> >>>>
> >>>>> don't
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> (or at least didn't once upon a time) proactively remove
> >>>>>>>>>>>>>> unnecessary
> >>>>>>>>>>>>>> run
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> files - removing all of them at end-of-job instead of at the
> >>>>>>>>>>>>>
> >>>>>>>>>>>> end
> >>>
> >>>> of
> >>>>
> >>>>> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> execution phase that uses their contents.  We may also have
> an
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> "Amdahl
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> problem" right now with our sort since we serialize phase
> two
> >>>>>>>>>>>>>
> >>>>>>>>>>>> of
> >>>
> >>>> parallel
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> sorts - though this is not a query, it's index build, so
> that
> >>>>>>>>>>>>>> shouldn't
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> it.  It would be interesting to put a df/sleep script on each
> >>>>>>>>>>>>>
> >>>>>>>>>>>> of
> >>>
> >>>> the
> >>>>>>>>>>>>>> nodes
> >>>>>>>>>>>>>> when this is happening - actually a script that monitors the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> temp
> >>>>
> >>>>> file
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes
> >>>>>>>>>>>>>
> >>>>>>>>>>>> change....
> >>>>
> >>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the
> >>>
> >>>> device
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> possibly you've run out of inodes even if the space isn't all
> >>>>>>>>>>>>>
> >>>>>>>>>>>> used
> >>>>
> >>>>> up.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> It's
> >>>>>>>>>>>>>> unlikely because I don't think AsterixDB creates a bunch of
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> small
> >>>>
> >>>>> files,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> but worth checking.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If that's not it, then can you share the full exception and
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> stack
> >>>>
> >>>>> trace?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ceej
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> aka Chris Hillery
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> wael.y.k@gmail.com>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I
> still
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> get
> >>>>
> >>>>> the
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> same
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> issue.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> The data contains:
> >>>>>>>>>>>>>>>>> 1- 2887453794 records.
> >>>>>>>>>>>>>>>>> 2- Schema:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> create type CDRType as {
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> id:uuid,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 'date':string,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 'time':string,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 'duration':int64,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 'caller':int64,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 'callee':int64,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> location:point?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> wael.y.k@gmail.com
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Dears,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I have a dataset of size 290GB loaded in a 3 NCs each of
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> which
> >>>>
> >>>>> has
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2x500GB
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> SSD.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Each of NC has two IODevices (partitions) in each hard
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> drive
> >>>
> >>>> (i.e
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> total is 4 iodevices per NC). After loading the data,
> each
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Asterix
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> partition occupied 31GB.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> (approximately
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> about 250GB free space in each hard drive). However,
> when I
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> tried
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> create
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> an index of type RTree, I got an exception that no space
> left
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> in
> >>>
> >>>> the
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> hard
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> drive during the External Sort phase.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Is that normal ?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> *Regards,*
> >>>>>>>>>>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> *Regards,*
> >>>>>>>>>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> *Regards,*
> >>>>>>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> *Regards,*
> >>>>>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>>
> >>>>>>>>>>>> *Regards,*
> >>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Jianfeng Jia
> >>>>>>>>> PhD Candidate of Computer Science
> >>>>>>>>> University of California, Irvine
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>> --
> >>>>
> >>>> *Regards,*
> >>>> Wail Alkowaileet
> >>>>
> >>>>
> >
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>

Re: Creating RTree: no space left

Posted by Wail Alkowaileet <wa...@gmail.com>.

Hi Ahmed and Mike,

@Ahmed
I actually ran a small experiment in which I loaded about 1/5 of the data
(so that I could index it), and it seems that the R-Tree was really useful
for querying small regions or neighborhoods.
I also tried the B-Tree, and it was slower than a full scan.

@Mike
Unfortunately, I still cannot share it, even after anonymization :-)


On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dt...@gmail.com> wrote:

> Interesting point, so to speak.  @Wail, any chance you could post a Google
> Maps screenshot showing a visualization of the points in this dataset on
> the underlying geographic region?  (If the dataset is shareable in that
> anonymized form?)  I would think an R-tree would still be good for
> small-region geo queries - possibly shrinking the candidate object set by a
> factor of 10,000 - so still useful - and we also do index-AND-ing now, so
> we would also combine that shrinkage with the shrinkage provided by indexes
> on any other index-amenable predicates.  I think the queries are still
> spatial in nature, and the only AsterixDB choice for that is the R-tree.
> (We did experiments with things like Hilbert B-trees, but the results led
> to the conclusion that the code base only needs R-trees for spatial data
> for the foreseeable future - they just work too well and in a
> no-tuning-required fashion.... :-))
>
>
>
> On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
>
>> Looks like an interesting case. Just a small question. Are you sure a
>> spatial index is the right one to use here? The spatial attribute looks
>> more like a categorization and a hash or B-tree index could be more
>> suitable. As far as I know, the spatial index in AsterixDB is a secondary
>> R-tree index which, like any other secondary index, is only good for
>> retrieving a small number of records. For this dataset, it seems that any
>> small range would still return a huge number of records.
>>
>> It is still interesting to further investigate and fix the sort issue but
>> I
>> mentioned the usage issue for a different perspective.
>>
>> Thanks
>> Ahmed
>>


-- 

*Regards,*
Wail Alkowaileet

Re: Creating RTree: no space left

Posted by Mike Carey <dt...@gmail.com>.

Interesting point, so to speak.  @Wail, any chance you could post a
Google Maps screenshot showing a visualization of the points in this
dataset on the underlying geographic region?  (If the dataset is
shareable in that anonymized form?)  I would think an R-tree would still
be good for small-region geo queries - possibly shrinking the candidate
object set by a factor of 10,000 - so still useful - and we also do
index-AND-ing now, so we would also combine that shrinkage with the
shrinkage provided by indexes on any other index-amenable predicates.  I
think the queries are still spatial in nature, and the only AsterixDB
choice for that is the R-tree.  (We did experiments with things like
Hilbert B-trees, but the results led to the conclusion that the code
base only needs R-trees for spatial data for the foreseeable future -
they just work too well and in a no-tuning-required fashion.... :-))
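
The index-AND-ing idea can be pictured as intersecting the primary keys
returned by each secondary index before fetching any records. Below is a
minimal Python sketch of that idea only. The two dictionaries are
hypothetical stand-ins for an R-tree on location and a B-tree on caller,
the keys and values are made-up sample data, and the function name
index_and is an assumption for illustration; none of this is taken from
the AsterixDB code base.

```python
from typing import Iterable, Set


def index_and(*candidate_sets: Iterable[int]) -> Set[int]:
    """Intersect primary-key candidates produced by several secondary indexes."""
    sets = [set(c) for c in candidate_sets]
    if not sets:
        return set()
    # Start from the smallest candidate set so each intersection stays cheap.
    sets.sort(key=len)
    result = sets[0]
    for other in sets[1:]:
        result &= other
    return result


# Hypothetical stand-ins for secondary indexes: search key -> primary keys.
location_index = {(1.0, 2.0): [1, 2, 3, 4], (5.0, 6.0): [5, 6]}
caller_index = {5551234: [2, 4, 6], 5559999: [1, 5]}

if __name__ == "__main__":
    by_location = location_index[(1.0, 2.0)]  # spatial predicate
    by_caller = caller_index[5551234]         # equality predicate
    print(sorted(index_and(by_location, by_caller)))  # prints [2, 4]
```

Only the keys that survive every predicate would then be looked up in the
primary index, which is where the combined shrinkage pays off.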


On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> Looks like an interesting case. Just a small question. Are you sure a
> spatial index is the right one to use here? The spatial attribute looks
> more like a categorization and a hash or B-tree index could be more
> suitable. As far as I know, the spatial index in AsterixDB is a secondary
> R-tree index which, like any other secondary index, is only good for
> retrieving a small number of records. For this dataset, it seems that any
> small range would still return a huge number of records.
>
> It is still interesting to further investigate and fix the sort issue but I
> mentioned the usage issue for a different perspective.
>
> Thanks
> Ahmed
>
> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dt...@gmail.com> wrote:
>
>> ☺!
>>
>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com> wrote:
>>
>>> To be exact
>>> I have 2,255,091,590 records and 10,391 points :-)
>>>
>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dt...@gmail.com> wrote:
>>>
>>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
>>>> remember having done it for sure.  Oops! Scattered from VLDB, I guess...!
>>>>
>>>>
>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
>>>>
>>>>> @Mike: You filed an issue -
>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
>>>>>
>>>>> Best,
>>>>> Taewoo
>>>>>
>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dt...@gmail.com> wrote:
>>>>>
>>>>>> I can't remember (slight jetlag? :-)) if I shared back to this list one
>>>>>> theory that came up in India when Wail and I talked F2F - his data has a
>>>>>> lot of duplicate points, so maybe something goes awry in that case.  I
>>>>>> wonder if we've sufficiently tested that case?  (E.g., what if there are
>>>>>> gazillions of records originating from a small handful of points?)
>>>>>>
>>>>>>
>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
>>>>>>
>>>>>>> Based on a rough calculation, per partition, each point field takes 3.6GB
>>>>>>> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we are
>>>>>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
>>>>>>> that there was no issue when creating a B+ tree index, we need to check
>>>>>>> what SORT process is required by R-Tree index.
>>>>>>>
>>>>>>> Best,
>>>>>>> Taewoo
>>>>>>>
>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> If all of the file names start with “ExternalSortRunGenerator”, then they
>>>>>>>> are the first round files which can not be GCed.
>>>>>>>> Could you provide the query plan as well?
>>>>>>>>
>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Ian and Pouria,
>>>>>>>>> The name of the files along with the sizes (there were 625 one of
>>>>>>>>> those
>>>>>>>>> before crashing):
>>>>>>>>>
>>>>>>>>> size        name
>>>>>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
>>>>>>>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
>>>>>>>>>
>>>>>>>>> no files were generated beyond runs.
>>>>>>>>> compiler.sortmemory = 64MB
>>>>>>>>>
>>>>>>>>> Here is the full logs
>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_
>>>>>>>>>
>>>>>>>>> 25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <
>>>>>>>>> pouria.pirzadeh@gmail.com>
>>>>>>>> wrote:
>>>>>>>>> We previously had issues with huge spilled sort temp files when
>>>>>>>>> creating
>>>>>>>>>
>>>>>>>>>> inverted index for fuzzy queries, but NOT R-Trees.
>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up
>> for
>>>>>>>>>> intermediate temp files until the end of the query execution.
>>>>>>>>>> If you can share names of a couple of temp files (and their sizes
>>>>>>>>>> along
>>>>>>>>>> with the sort memory setting you have in
>> asterix-configuration.xml)
>>>>>>>>>> we
>>>>>>>>>>
>>>>>>>>>> may
>>>>>>>>> be able to have a better guess as if the sort is really going
>> into a
>>>>>>>>>> two-level merge or not.
>>>>>>>>>>
>>>>>>>>>> Pouria
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <im...@uci.edu>
>>> wrote:
>>>>>>>>>> I think that execption ("No space left on device") is just casted
>>>>>>>>>> from
>>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>> native IOException. Therefore I would be inclined to believe it's
>>>>>>>>>>> genuinely
>>>>>>>>>> out of space. I suppose the question is why the external sort is
>> so
>>>>>>>>>>> huge.
>>>>>>>>> What is the query plan? Maybe that will shed light on a possible
>>>>>>>>> cause.
>>>>>>>>>
>>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <
>>>>>>>>>>> wael.y.k@gmail.com
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <
>>>>>>>>>>>> wael.y.k@gmail.com
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Chris and Mike,
>>>>>>>>>>>>
>>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>>>>>>>
>>>>>>>>>>>>>      - The size of each partition is about 40GB (80GB in total
>>> per
>>>>>>>>>>>>>      iodevice).
>>>>>>>>>>>>>      - The runs took 157GB per iodevice (about 2x of the
>> dataset
>>>>>>>>>>>>> size).
>>>>>>>>>>>>>      Each run takes either of 128MB or 96MB of storage.
>>>>>>>>>>>>>      - At a certain time, there were 522 runs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I even tried to create a BTree Index to see if that happens as
>>>>>>>>>>>>> well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I
>>>>>>>>>>> created two BTree indexes one for the *location* and one for the
>>>>>>>>>>>
>>>>>>>>>>>> *caller
>>>>>>>>>>>> *and
>>>>>>>>>>>>
>>>>>>>>>>>> they were created successfully. The sizes of the runs didn't
>> take
>>>>>>>>>>>>> anyway
>>>>>>>>>>>> near that.
>>>>>>>>>>>>
>>>>>>>>>>>>> Logs are attached.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <
>> dtabass@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that
>>> we
>>>>>>>>>>>> don't
>>>>>>>>>>>>> (or at least didn't once upon a time) proactively remove
>>>>>>>>>>>>> unnecessary
>>>>>>>>>>>>> run
>>>>>>>>>>>>>
>>>>>>>>>>>> files - removing all of them at end-of-job instead of at the
>> end
>>> of
>>>>>>>>>>>>> the
>>>>>>>>>>>>>
>>>>>>>>>>>> execution phase that uses their contents.  We may also have an
>>>>>>>>>>>>
>>>>>>>>>>>>> "Amdahl
>>>>>>>>>>>>>
>>>>>>>>>>>> problem" right now with our sort since we serialize phase two
>> of
>>>>>>>>>>>> parallel
>>>>>>>>>>>>> sorts - though this is not a query, it's index build, so that
>>>>>>>>>>>>> shouldn't
>>>>>>>>>>>>>
>>>>>>>>>>>> be
>>>>>>>>>>>>
>>>>>>>>>>>> it.  It would be interesting to put a df/sleep script on each
>> of
>>>>>>>>>>>>> the
>>>>>>>>>>>>> nodes
>>>>>>>>>>>>> when this is happening - actually a script that monitors the
>>> temp
>>>>>>>>>>>>> file
>>>>>>>>>>>>>
>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes
>>> change....
>>>>>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on
>> the
>>>>>>>>>>>>>> device
>>>>>>>>>>>>>>
>>>>>>>>>>>>> -
>>>>>>>>>>>> possibly you've run out of inodes even if the space isn't all
>>> used
>>>>>>>>>>>>>> up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> It's
>>>>>>>>>>>>> unlikely because I don't think AsterixDB creates a bunch of
>>> small
>>>>>>>>>>>>>>> files,
>>>>>>>>>>>>> but worth checking.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If that's not it, then can you share the full exception and
>>> stack
>>>>>>>>>>>>>>> trace?
>>>>>>>>>>>>> Ceej
>>>>>>>>>>>>>
>>>>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> wael.y.k@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still
>>> get
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>> same
>>>>>>>>>>>> issue.
>>>>>>>>>>>>>>>> The data contains:
>>>>>>>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>>>>>>>> 2- Schema:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> create type CDRType as {
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> id:uuid,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 'date':string,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 'time':string,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 'duration':int64,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 'caller':int64,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 'callee':int64,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> location:point?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> wael.y.k@gmail.com
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>>> I have a dataset of size 290GB loaded in a 3 NCs each of
>>> which
>>>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>> 2x500GB
>>>>>>>>>>>> SSD.
>>>>>>>>>>>>>>>>> Each of NC has two IODevices (partitions) in each hard
>> drive
>>>>>>>>>>>>>>>>> (i.e
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> total is 4 iodevices per NC). After loading the data, each
>>>>>>>>>>>>> Asterix
>>>>>>>>>>>>>>> partition occupied 31GB.
>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
>>>>>>>>>>>>>>>>> (approximately
>>>>>>>>>>>>>>> about 250GB free space in each hard drive). However, when I
>>>>>>>>>>>>> tried
>>>>>>>>>>>>>
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> create
>>>>>>>>>>>>> an index of type RTree, I got an exception that no space left
>> in
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hard
>>>>>>>>>>>>> drive during the External Sort phase.
>>>>>>>>>>>>>>>>> Is that normal ?

Re: Creating RTree: no space left

Posted by Ahmed Eldawy <el...@cs.ucr.edu>.

Looks like an interesting case. Just a small question. Are you sure a
spatial index is the right one to use here? The spatial attribute looks
more like a categorization and a hash or B-tree index could be more
suitable. As far as I know, the spatial index in AsterixDB is a secondary
R-tree index which, like any other secondary index, is only good for
retrieving a small number of records. For this dataset, it seems that any
small range would still return a huge number of records.
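To put a rough number on that, here is a minimal sketch using the record and point counts Wail reports in this thread. It assumes records are spread evenly over the distinct points, which is almost certainly not true of real call data, so read the result as an order-of-magnitude figure only. The function and constant names are made up for illustration.

```python
# Counts reported by Wail in this thread.
TOTAL_RECORDS = 2_255_091_590
DISTINCT_POINTS = 10_391


def avg_records_per_point(total_records: int, distinct_points: int) -> float:
    """Average number of records sharing one point value.

    Assumes a uniform spread of records over points (an illustration-only
    assumption; real data is likely skewed).
    """
    return total_records / distinct_points


def fraction_per_point(distinct_points: int) -> float:
    """Fraction of the dataset matched by a lookup on a single point value."""
    return 1.0 / distinct_points


if __name__ == "__main__":
    avg = avg_records_per_point(TOTAL_RECORDS, DISTINCT_POINTS)
    frac = fraction_per_point(DISTINCT_POINTS)
    print(f"about {avg:,.0f} records per distinct point")
    print(f"about {frac:.4%} of the dataset per point lookup")
```

Under that assumption, even a lookup that hits exactly one point comes back with roughly 217,000 records, which is far more than a secondary index is normally expected to return.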

It is still worth investigating and fixing the sort issue; I raise the
usage question only to offer a different perspective.

Thanks
Ahmed

On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dt...@gmail.com> wrote:

> ☺!
>
> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com> wrote:
>
> > To be exact
> > I have 2,255,091,590 records and 10,391 points :-)

Re: Creating RTree: no space left

Posted by Mike Carey <dt...@gmail.com>.

☺!

On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa...@gmail.com> wrote:

> To be exact
> I have 2,255,091,590 records and 10,391 points :-)

Re: Creating RTree: no space left

Posted by Wail Alkowaileet <wa...@gmail.com>.

To be exact, I have 2,255,091,590 records and 10,391 distinct points :-)
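For comparison, these counts can be plugged into Taewoo's rough per-partition estimate from earlier in this thread (16 bytes per point, 12 partitions). This is only a sketch of that back-of-the-envelope arithmetic; the helper name is hypothetical, and reading the 16 bytes as two 8-byte doubles per point is an assumption.

```python
POINT_BYTES = 16   # per-point size used in the thread's estimate (presumably two 8-byte doubles)
PARTITIONS = 12    # 3 NCs x 4 iodevices, as described in the thread
GIB = 2 ** 30


def point_bytes_per_partition(records: int,
                              point_bytes: int = POINT_BYTES,
                              partitions: int = PARTITIONS) -> float:
    """Bytes of raw point data each partition has to sort, assuming an even split."""
    return records * point_bytes / partitions


if __name__ == "__main__":
    for label, records in (("count used in the original estimate", 2_887_453_794),
                           ("exact count", 2_255_091_590)):
        size_gib = point_bytes_per_partition(records) / GIB
        print(f"{label}: {size_gib:.2f} GiB of point data per partition")
```

Either way the raw point data comes to a few GiB per partition, so the exact record count does not change the picture: the sort runs on disk are far larger than the keys being sorted.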

On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dt...@gmail.com> wrote:

> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
> remember having done it for sure.  Oops! Scattered from VLDB, I guess...!


-- 

*Regards,*
Wail Alkowaileet

Re: Creating RTree: no space left

Posted by Mike Carey <dt...@gmail.com>.

Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't 
remember having done it for sure.  Oops! Scattered from VLDB, I guess...!


On 9/13/16 9:58 PM, Taewoo Kim wrote:
> @Mike: You filed an issue -
> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
>
> Best,
> Taewoo
>
> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dt...@gmail.com> wrote:
>
>> I can't remember (slight jetlag? :-)) if I shared back to this list one
>> theory that came up in India when Wail and I talked F2F - his data has a
>> lot of duplicate points, so maybe something goes awry in that case.  I
>> wonder if we've sufficiently tested that case?  (E.g., what if there are
>> gazillions of records originating from a small handful of points?)
>>
>>
>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
>>
>>> Based on a rough calculation, per partition, each point field takes 3.6GB
>>> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we are
>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
>>> that there was no issue when creating a B+ tree index, we need to check
>>> what SORT process is required by R-Tree index.
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <ji...@gmail.com>
>>> wrote:
>>>
>>>> If all of the file names start with “ExternalSortRunGenerator”, then they
>>>> are the first round files which can not be GCed.
>>>> Could you provide the query plan as well?
>>>>
>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Ian and Pouria,
>>>>>
>>>>> The name of the files along with the sizes (there were 625 one of those
>>>>> before crashing):
>>>>>
>>>>> size        name
>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
>>>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
>>>>>
>>>>> no files were generated beyond runs.
>>>>> compiler.sortmemory = 64MB
>>>>>
>>>>> Here is the full logs
>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>>>
>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> We previously had issues with huge spilled sort temp files when creating
>>>>>> inverted index for fuzzy queries, but NOT R-Trees.
>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
>>>>>> intermediate temp files until the end of the query execution.
>>>>>> If you can share names of a couple of temp files (and their sizes along
>>>>>> with the sort memory setting you have in asterix-configuration.xml) we may
>>>>>> be able to have a better guess as if the sort is really going into a
>>>>>> two-level merge or not.
>>>>>>
>>>>>> Pouria
>>>>>>
>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <im...@uci.edu> wrote:
>>>>>>
>>>>>>> I think that exception ("No space left on device") is just cast from
>>>>>>> the native IOException. Therefore I would be inclined to believe it's
>>>>>>> genuinely out of space. I suppose the question is why the external sort
>>>>>>> is so huge.
>>>>>>> What is the query plan? Maybe that will shed light on a possible cause.
>>>>>>>
>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
>>>>>>>>
>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Chris and Mike,
>>>>>>>>>
>>>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>>>
>>>>>>>>>     - The size of each partition is about 40GB (80GB in total per
>>>>>>>>>     iodevice).
>>>>>>>>>     - The runs took 157GB per iodevice (about 2x of the dataset size).
>>>>>>>>>     Each run takes either of 128MB or 96MB of storage.
>>>>>>>>>     - At a certain time, there were 522 runs.
>>>>>>>>>
>>>>>>>>> I even tried to create a BTree Index to see if that happens as well. I
>>>>>>>>> created two BTree indexes one for the *location* and one for the
>>>>>>>>> *caller* and they were created successfully. The sizes of the runs
>>>>>>>>> didn't take anyway near that.
>>>>>>>>>
>>>>>>>>> Logs are attached.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dt...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that we
>>>>>>>>>> don't (or at least didn't once upon a time) proactively remove
>>>>>>>>>> unnecessary run files - removing all of them at end-of-job instead of
>>>>>>>>>> at the end of the execution phase that uses their contents.  We may
>>>>>>>>>> also have an "Amdahl problem" right now with our sort since we
>>>>>>>>>> serialize phase two of parallel sorts - though this is not a query,
>>>>>>>>>> it's index build, so that shouldn't be it.  It would be interesting to
>>>>>>>>>> put a df/sleep script on each of the nodes when this is happening -
>>>>>>>>>> actually a script that monitors the temp file directory - and watch
>>>>>>>>>> the lifecycle happen and the sizes change....
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>>>
>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the
>>>>>>>>>>> device - possibly you've run out of inodes even if the space isn't
>>>>>>>>>>> all used up. It's unlikely because I don't think AsterixDB creates a
>>>>>>>>>>> bunch of small files, but worth checking.
>>>>>>>>>>>
>>>>>>>>>>> If that's not it, then can you share the full exception and stack
>>>>>>>>>>> trace?
>>>>>>>>>>>
>>>>>>>>>>> Ceej
>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still get
>>>>>>>>>>>> the same issue.
>>>>>>>>>>>>
>>>>>>>>>>>> The data contains:
>>>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>>>> 2- Schema:
>>>>>>>>>>>>
>>>>>>>>>>>> create type CDRType as {
>>>>>>>>>>>>
>>>>>>>>>>>> id:uuid,
>>>>>>>>>>>>
>>>>>>>>>>>> 'date':string,
>>>>>>>>>>>>
>>>>>>>>>>>> 'time':string,
>>>>>>>>>>>>
>>>>>>>>>>>> 'duration':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> 'caller':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> 'callee':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> location:point?
>>>>>>>>>>>>
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a dataset of size 290GB loaded in a 3 NCs each of which has
>>>>>>>>>>>>> 2x500GB SSD.
>>>>>>>>>>>>> Each of NC has two IODevices (partitions) in each hard drive (i.e
>>>>>>>>>>>>> the total is 4 iodevices per NC). After loading the data, each
>>>>>>>>>>>>> Asterix partition occupied 31GB.
>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
>>>>>>>>>>>>> (approximately about 250GB free space in each hard drive). However,
>>>>>>>>>>>>> when I tried to create an index of type RTree, I got an exception
>>>>>>>>>>>>> that no space left in the hard drive during the External Sort phase.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that normal ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> *Regards,*
>>>>>>>>> Wail Alkowaileet
>>>>>>>>>
>>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> *Regards,*
>>>>>>>> Wail Alkowaileet
>>>>>>>>
>>>>>>>>
>>>>> --
>>>>>
>>>>> *Regards,*
>>>>> Wail Alkowaileet
>>>>>
>>>>
>>>> Best,
>>>>
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>>
>>>>
>>>>
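
Editorial sketch (not part of the original messages): the rough calculation
quoted above from Taewoo can be reproduced in a few lines. The constants are
the figures reported in the thread; the function names are hypothetical, and
the binary units (1MB = 2^20 bytes, 1GB = 2^30 bytes) are an assumption, since
the thread does not say which convention was used.

```python
# Figures reported in the thread.
RECORDS = 2_887_453_794       # records in the dataset
POINT_BYTES = 16              # bytes per point field, as used in the estimate
PARTITIONS = 12               # 3 NCs x 4 iodevices per NC
RUN_FILES = 625               # run files seen before the crash
RUN_SIZES_MB = (96, 128)      # observed run file sizes

# Assumption: binary units.
MIB = 1024 ** 2
GIB = 1024 ** 3


def point_bytes_per_partition(records=RECORDS, width=POINT_BYTES,
                              partitions=PARTITIONS):
    """Bytes of raw point data that one partition has to sort."""
    return records * width / partitions


def run_bytes_bounds(files=RUN_FILES, sizes_mb=RUN_SIZES_MB):
    """Lower and upper bound on the bytes held by `files` run files."""
    return files * min(sizes_mb) * MIB, files * max(sizes_mb) * MIB


if __name__ == "__main__":
    points = point_bytes_per_partition()
    low, high = run_bytes_bounds()
    print(f"point data per partition: {points / GIB:.2f} GB")
    print(f"{RUN_FILES} run files: {low / GIB:.1f} to {high / GIB:.1f} GB")
    print(f"runs vs. point data: at least {low / points:.0f}x larger")
```

Under these assumptions even the lower bound for 625 run files is more than
an order of magnitude above the raw point data for one partition, which is
the mismatch the thread is trying to explain.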

Re: Creating RTree: no space left

Posted by Taewoo Kim <wa...@gmail.com>.

@Mike: You filed an issue -
https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)

Best,
Taewoo

On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dt...@gmail.com> wrote:

> I can't remember (slight jetlag? :-)) if I shared back to this list one
> theory that came up in India when Wail and I talked F2F - his data has a
> lot of duplicate points, so maybe something goes awry in that case.  I
> wonder if we've sufficiently tested that case?  (E.g., what if there are
> gazillions of records originating from a small handful of points?)
>
>
>
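
Editorial sketch (not part of the original messages): earlier in the thread
Mike suggested putting a df/sleep script on each node to watch the temp file
directory while the index build runs. A minimal version of that idea is
below. The function names and the command-line directory argument are
hypothetical; the thread does not say where AsterixDB writes its sort run
files, so the path must be supplied by whoever runs it. Only the
"ExternalSortRunGenerator" file name prefix comes from the reported listing.

```python
import os
import shutil
import sys
import time

# Prefix of the run files listed in the thread.
RUN_PREFIX = "ExternalSortRunGenerator"


def snapshot(temp_dir, prefix=RUN_PREFIX):
    """Return (run file count, total run bytes, free bytes on the device)."""
    count = 0
    total = 0
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.startswith(prefix):
                try:
                    total += entry.stat().st_size
                    count += 1
                except FileNotFoundError:
                    # The file was removed between listing and stat.
                    continue
    free = shutil.disk_usage(temp_dir).free
    return count, total, free


def monitor(temp_dir, interval_s=10.0, iterations=None):
    """Print one snapshot per interval; loop forever if iterations is None."""
    done = 0
    while iterations is None or done < iterations:
        count, total, free = snapshot(temp_dir)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} runs={count} run_bytes={total} free_bytes={free}",
              flush=True)
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_s)


if __name__ == "__main__":
    # Usage: python monitor_runs.py <temp directory to watch>
    monitor(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Running this on each node during the index build would show whether the run
count keeps growing until the device fills, or whether runs are merged and
removed along the way.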