You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ignite.apache.org by Andrey Mashenkov <an...@gmail.com> on 2017/01/21 12:26:27 UTC
Optimize integer sets.
Hi Guys
Alexei Scherbakov report a ticket few time ago [1]. The solution look
promissing.
Alexei, you wrote that this can save some memory. More over replacing
linked Set structure to array based bit-set
can give a speed-up due to array based structures are cache friendly.
But one thing is not clear for me how we will handle sparsed bit-sets? For
example, if we have 1024 partiotions (as it is by default)
and have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes
that seem to be more than Set<Integer>.
What do you mean exactly to use bit-set as more compact structure then
Set<Integer> or bit-set with some additional compression?
I would thought, we can use hash-set with open addressing in some cases
like that to get gain of array bases structures over linked structures and
save memory?
For example, we could use such hash-set for small data (64bytes as cache
line size) and use bit-sets for bigger data, if it's possible of course.
Thoughts?
[1] https://issues.apache.org/jira/browse/IGNITE-4554
答复: Optimize integer sets.
Posted by Shawn Du <sh...@neulion.com.cn>.
Try this https://github.com/RoaringBitmap/RoaringBitmap
-----邮件原件-----
发件人: Sergi Vladykin [mailto:sergi.vladykin@gmail.com]
发送时间: 2017年1月21日 21:29
收件人: dev@ignite.apache.org
主题: Re: Optimize integer sets.
I'd suggest to have a single abstract class `Partitions` with protected constructor and static factory method. This will allow to add different optimized for any particular case implementations transparently.
Sergi
2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <an...@gmail.com>:
> Hi Guys
>
> Alexei Scherbakov report a ticket few time ago [1]. The solution look
> promissing.
>
> Alexei, you wrote that this can save some memory. More over replacing
> linked Set structure to array based bit-set can give a speed-up due to
> array based structures are cache friendly.
>
> But one thing is not clear for me how we will handle sparsed bit-sets?
> For example, if we have 1024 partiotions (as it is by default) and
> have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes
> that seem to be more than Set<Integer>.
>
> What do you mean exactly to use bit-set as more compact structure then
> Set<Integer> or bit-set with some additional compression?
>
> I would thought, we can use hash-set with open addressing in some
> cases like that to get gain of array bases structures over linked
> structures and save memory?
> For example, we could use such hash-set for small data (64bytes as
> cache line size) and use bit-sets for bigger data, if it's possible of course.
>
>
> Thoughts?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4554
>
Re: Optimize integer sets.
Posted by Alexei Scherbakov <al...@gmail.com>.
Andrey,
IGNITE-4554 is about compressed bit sets, not standart bit sets.
Currently I'm implementing data structure based on [1]
It should efficiently handle sparse bit sets.
[1] http://roaringbitmap.org/
2017-01-21 16:28 GMT+03:00 Sergi Vladykin <se...@gmail.com>:
> I'd suggest to have a single abstract class `Partitions` with protected
> constructor and static factory method. This will allow to add different
> optimized for any particular case implementations transparently.
>
> Sergi
>
> 2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <an...@gmail.com>:
>
> > Hi Guys
> >
> > Alexei Scherbakov report a ticket few time ago [1]. The solution look
> > promissing.
> >
> > Alexei, you wrote that this can save some memory. More over replacing
> > linked Set structure to array based bit-set
> > can give a speed-up due to array based structures are cache friendly.
> >
> > But one thing is not clear for me how we will handle sparsed bit-sets?
> For
> > example, if we have 1024 partiotions (as it is by default)
> > and have much nodes, e.g. 512. In this case, bit-set will occupy 256
> bytes
> > that seem to be more than Set<Integer>.
> >
> > What do you mean exactly to use bit-set as more compact structure then
> > Set<Integer> or bit-set with some additional compression?
> >
> > I would thought, we can use hash-set with open addressing in some cases
> > like that to get gain of array bases structures over linked structures
> and
> > save memory?
> > For example, we could use such hash-set for small data (64bytes as cache
> > line size) and use bit-sets for bigger data, if it's possible of course.
> >
> >
> > Thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-4554
> >
>
--
Best regards,
Alexei Scherbakov
Re: Optimize integer sets.
Posted by Sergi Vladykin <se...@gmail.com>.
I'd suggest to have a single abstract class `Partitions` with protected
constructor and static factory method. This will allow to add different
optimized for any particular case implementations transparently.
Sergi
2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <an...@gmail.com>:
> Hi Guys
>
> Alexei Scherbakov report a ticket few time ago [1]. The solution look
> promissing.
>
> Alexei, you wrote that this can save some memory. More over replacing
> linked Set structure to array based bit-set
> can give a speed-up due to array based structures are cache friendly.
>
> But one thing is not clear for me how we will handle sparsed bit-sets? For
> example, if we have 1024 partiotions (as it is by default)
> and have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes
> that seem to be more than Set<Integer>.
>
> What do you mean exactly to use bit-set as more compact structure then
> Set<Integer> or bit-set with some additional compression?
>
> I would thought, we can use hash-set with open addressing in some cases
> like that to get gain of array bases structures over linked structures and
> save memory?
> For example, we could use such hash-set for small data (64bytes as cache
> line size) and use bit-sets for bigger data, if it's possible of course.
>
>
> Thoughts?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4554
>