Posted to user@ignite.apache.org by Alexey Kukushkin <ku...@gmail.com> on 2017/10/16 09:19:25 UTC

Re: integrate with prestodb

Cross-sending to the DEV community.

On Mon, Oct 16, 2017 at 12:14 PM, shawn.du <sh...@neulion.com.cn> wrote:

> Hi community,
>
> I am trying to implement a Presto connector for Ignite.
> I think connecting Ignite and Presto would be very interesting.
>
> We currently use Ignite and it works very well. But to save memory, we
> store our data as compressed binary, so we cannot query it with SQL.
> We use Ignite map-reduce to query the data instead.
>
> With Presto, we could use SQL again. If it is fast enough, Ignite would be
> our in-memory storage and would not be responsible for computation, or only
> for simple queries.
> My only concern is whether Presto is as fast as Ignite. Currently all our
> Ignite queries take less than 5 seconds, and most take hundreds of
> milliseconds.
> Presto also provides a connector for Redis. Would the community be
> interested in contributing to a presto-ignite connector?
>
> Thanks
> Shawn
>
>


-- 
Best regards,
Alexey

Re: integrate with prestodb

Posted by Denis Magda <dm...@gridgain.com>.
Shawn,

What's the purpose of Presto, then, if you are considering holding all the
data in RAM? From what I see, Presto is intended for joining data stored in
different storage systems.

As for the Ignite persistence, here are some performance hints you might
need to apply:
https://apacheignite.readme.io/docs/durable-memory-tuning

--
Denis


On Mon, Oct 16, 2017 at 7:42 PM, shawn.du <sh...@neulion.com.cn> wrote:

> Hi Denis,
>
> We are evaluating this feature (our production runs Ignite 1.9 and we are
> testing Ignite 2.2). It does make things simpler, but we don't want to lose
> performance.
> We need careful testing. Judging from our first round of tests, disk
> I/O will be the bottleneck.
> The load average is higher than with Ignite 1.9 without this feature. Also,
> I don't know whether loading data from disk will be fast enough
> compared with decoding the data in memory.
>
> Thanks
> Shawn
>
> On 10/17/2017 10:25, Denis Magda <dm...@apache.org> wrote:
>
> Shawn,
>
> Then my suggestion would be to enable Ignite persistence [1], which will
> store your whole data set. RAM will keep only a subset for performance.
> Ignite SQL is fully supported with persistence; you can even join
> in-RAM and disk-only data sets. Plus, your compression becomes optional.
>
> [1] https://ignite.apache.org/features/persistence.html
>
> —
> Denis
>
> > On Oct 16, 2017, at 7:18 PM, shawn.du <sh...@neulion.com.cn> wrote:
> >
> > Hi Denis,
> >
> > Yes, we do want to limit RAM to less than 64G. RAM is still an
> > expensive resource.
> > If we store our data in Ignite's SQL-queryable format, it may use
> > more than 640G. This is too expensive for us.
> > So we store data in a binary format that works a bit like ORC or
> > Parquet. Only a few important columns are SQL-queryable; the others
> > are not. This way we use less RAM, but we have to use map-reduce to
> > query the data, which is a bit complex: query from the client with
> > SQL, then submit jobs to Ignite compute, and finally do some
> > post-aggregation in the client.
> > This is why I want to try Presto. We like SQL, and we want all
> > computation on the server side.
> >
> > Your comments are welcome.
> >
> > Thanks
> > Shawn
> >
> > On 10/17/2017 07:57, Denis Magda <dm...@apache.org> wrote:
> > Hello Shawn,
> >
> > Do I understand correctly that you have scarce RAM and are thinking
> > of using Presto as an alternative SQL engine for Ignite that queries
> > both RAM and disk data sets? If that's the case, then just enable
> > Ignite native persistence [1] and you'll get all the data stored on
> > disk and as much as you can afford in RAM. SQL works over both tiers
> > transparently.
> >
> > [1] https://ignite.apache.org/features/persistence.html
> >
> > --
> > Denis
> >
> > > On Oct 16, 2017, at 2:19 AM, Alexey Kukushkin <kukushkinalexey@gmail.com> wrote:
> > >
> > > Cross-sending to the DEV community.
> > >
> > > On Mon, Oct 16, 2017 at 12:14 PM, shawn.du <shawn.du@neulion.com.cn> wrote:
> > > Hi community,
> > >
> > > I am trying to implement a Presto connector for Ignite.
> > > I think connecting Ignite and Presto would be very interesting.
> > >
> > > We currently use Ignite and it works very well. But to save
> > > memory, we store our data as compressed binary, so we cannot query
> > > it with SQL. We use Ignite map-reduce to query the data instead.
> > >
> > > With Presto, we could use SQL again. If it is fast enough, Ignite
> > > would be our in-memory storage and would not be responsible for
> > > computation, or only for simple queries.
> > > My only concern is whether Presto is as fast as Ignite. Currently
> > > all our Ignite queries take less than 5 seconds, and most take
> > > hundreds of milliseconds.
> > > Presto also provides a connector for Redis. Would the community be
> > > interested in contributing to a presto-ignite connector?
> > >
> > > Thanks
> > > Shawn
> > >
> > > --
> > > Best regards,
> > > Alexey
> >
>
>

Re: integrate with prestodb

Posted by Denis Magda <dm...@apache.org>.
Shawn,

Then my suggestion would be to enable Ignite persistence [1], which will store your whole data set. RAM will keep only a subset for performance. Ignite SQL is fully supported with persistence; you can even join in-RAM and disk-only data sets. Plus, your compression becomes optional.

[1] https://ignite.apache.org/features/persistence.html

—
Denis

> On Oct 16, 2017, at 7:18 PM, shawn.du <sh...@neulion.com.cn> wrote:
> 
> Hi Denis,
> 
> Yes, we do want to limit RAM to less than 64G. RAM is still an expensive resource.
> If we store our data in Ignite's SQL-queryable format, it may use more than 640G. This is too expensive for us.
> So we store data in a binary format that works a bit like ORC or Parquet. Only a few important columns are SQL-queryable; the others are not. This way we use less RAM, but we have to use map-reduce to query the data, which is a bit complex: query from the client with SQL, then submit jobs to Ignite compute, and finally do some post-aggregation in the client.
> This is why I want to try Presto. We like SQL, and we want all computation on the server side.
>
> Your comments are welcome.
> 
> Thanks
> Shawn
> 
> On 10/17/2017 07:57, Denis Magda <dm...@apache.org> wrote:
> Hello Shawn,
>
> Do I understand correctly that you have scarce RAM and are thinking of using Presto as an alternative SQL engine for Ignite that queries both RAM and disk data sets? If that’s the case, then just enable Ignite native persistence [1] and you’ll get all the data stored on disk and as much as you can afford in RAM. SQL works over both tiers transparently.
>
> [1] https://ignite.apache.org/features/persistence.html
> 
> — 
> Denis 
> 
> > On Oct 16, 2017, at 2:19 AM, Alexey Kukushkin <kukushkinalexey@gmail.com> wrote:
> >
> > Cross-sending to the DEV community.
> >
> > On Mon, Oct 16, 2017 at 12:14 PM, shawn.du <shawn.du@neulion.com.cn> wrote:
> > Hi community, 
> >  
> > I am trying to implement a Presto connector for Ignite.
> > I think connecting Ignite and Presto would be very interesting.
> >
> > We currently use Ignite and it works very well. But to save memory, we store our data as compressed binary, so we cannot query it with SQL.
> > We use Ignite map-reduce to query the data instead.
> >
> > With Presto, we could use SQL again. If it is fast enough, Ignite would be our in-memory storage and would not be responsible for computation, or only for simple queries.
> > My only concern is whether Presto is as fast as Ignite. Currently all our Ignite queries take less than 5 seconds, and most take hundreds of milliseconds.
> > Presto also provides a connector for Redis. Would the community be interested in contributing to a presto-ignite connector?
> >  
> > Thanks 
> > Shawn 
> >  
> >  
> >  
> >  
> > --  
> > Best regards, 
> > Alexey 
> 
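The three-step flow Shawn describes in this thread (SQL over the few queryable columns, compute jobs that decode the compressed blobs, post-aggregation in the client) can be sketched in plain Python. This is only an illustration under assumptions: it does not use any Ignite API, zlib and JSON stand in for the unspecified binary format, and the names `NODES`, `make_entry`, `select_keys`, `map_job`, and `query`, as well as the sample data, are hypothetical.

```python
import json
import zlib
from collections import Counter


def make_entry(key, day, events):
    """Build a stand-in cache entry: one queryable column plus a compressed blob."""
    blob = zlib.compress(json.dumps(events).encode("utf-8"))
    return {"key": key, "day": day, "blob": blob}


# Hypothetical cluster: each inner list is the data held by one node.
NODES = [
    [
        make_entry(1, "2017-10-16", [{"page": "home", "ms": 120},
                                     {"page": "video", "ms": 300}]),
        make_entry(2, "2017-10-15", [{"page": "home", "ms": 90}]),
    ],
    [
        make_entry(3, "2017-10-16", [{"page": "video", "ms": 250}]),
    ],
]


def select_keys(day):
    """Step 1: stand-in for the client-side SQL query on a queryable column."""
    return [e["key"] for node in NODES for e in node if e["day"] == day]


def map_job(node, keys):
    """Step 2: stand-in for a compute job; decodes blobs on the node that holds them."""
    wanted = set(keys)
    partial = Counter()
    for entry in node:
        if entry["key"] in wanted:
            events = json.loads(zlib.decompress(entry["blob"]).decode("utf-8"))
            for event in events:
                partial[event["page"]] += event["ms"]
    return partial


def query(day):
    """Step 3: the client merges the per-node partial results."""
    keys = select_keys(day)
    total = Counter()
    for node in NODES:
        total.update(map_job(node, keys))
    return dict(total)


if __name__ == "__main__":
    print(query("2017-10-16"))
```

The sketch makes the cost of the approach visible: every query needs custom job code for steps 2 and 3, which is the part a SQL engine would otherwise handle.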


Re: integrate with prestodb

Posted by Denis Magda <dm...@apache.org>.
Hello Shawn,

Do I understand correctly that you have scarce RAM and are thinking of using Presto as an alternative SQL engine for Ignite that queries both RAM and disk data sets? If that’s the case, then just enable Ignite native persistence [1] and you’ll get all the data stored on disk and as much as you can afford in RAM. SQL works over both tiers transparently.

[1] https://ignite.apache.org/features/persistence.html

—
Denis

> On Oct 16, 2017, at 2:19 AM, Alexey Kukushkin <ku...@gmail.com> wrote:
> 
> Cross-sending to the DEV community.
> 
> On Mon, Oct 16, 2017 at 12:14 PM, shawn.du <shawn.du@neulion.com.cn> wrote:
> Hi community,
> 
> I am trying to implement a Presto connector for Ignite.
> I think connecting Ignite and Presto would be very interesting.
>
> We currently use Ignite and it works very well. But to save memory, we store our data as compressed binary, so we cannot query it with SQL.
> We use Ignite map-reduce to query the data instead.
>
> With Presto, we could use SQL again. If it is fast enough, Ignite would be our in-memory storage and would not be responsible for computation, or only for simple queries.
> My only concern is whether Presto is as fast as Ignite. Currently all our Ignite queries take less than 5 seconds, and most take hundreds of milliseconds.
> Presto also provides a connector for Redis. Would the community be interested in contributing to a presto-ignite connector?
> 
> Thanks
> Shawn
> 
> 
> 
> 
> -- 
> Best regards,
> Alexey
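As a rough worked example of "all the data stored on disk and as much as you can afford in RAM", the figures quoted elsewhere in this thread (a RAM budget of about 64G against roughly 640G of SQL-queryable data) can be plugged into a small calculation. This is a back-of-the-envelope sketch, not a model of how Ignite pages data: it assumes uniformly random access, which real workloads rarely follow, and the function names are hypothetical.

```python
def resident_fraction(ram_gb, dataset_gb):
    """Fraction of the data set that fits in RAM, capped at 1.0."""
    if dataset_gb <= 0:
        raise ValueError("dataset_gb must be positive")
    return min(1.0, ram_gb / dataset_gb)


def expected_disk_reads(n_lookups, ram_gb, dataset_gb):
    """Expected lookups served from disk, ASSUMING uniform random access."""
    return n_lookups * (1.0 - resident_fraction(ram_gb, dataset_gb))


if __name__ == "__main__":
    # Figures from the thread: 64G of RAM, about 640G of queryable data.
    print(resident_fraction(64, 640))
    print(expected_disk_reads(1000, 64, 640))
```

With these numbers about one tenth of the data is RAM-resident, so under the uniform-access assumption most lookups would touch disk; a skewed workload with a small hot set would see far fewer disk reads.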

