Posted to dev@carbondata.apache.org by Ajantha Bhat <aj...@gmail.com> on 2020/02/12 05:33:12 UTC

Regarding presto carbondata integration

Hi all,

Currently, the master code of carbondata works with *prestodb 0.217*.
We all know about the competing *presto-sql* as well.
Some users don't want to migrate to *presto-sql*, as their cloud
vendor doesn't support it (for example, AWS EMR, Huawei MRS, and Azure
services except HDInsight still come with *prestodb*).

So,
1. Does carbondata need to support both of them?
2. Does carbondata need to maintain two modules, one for prestodb and one for
prestosql? We may need to extract the common code (big effort).
3. At any time, carbondata can support only one version of prestodb and
presto-sql. They release a new version every 15 days, and our integration is not
based on the SPI (it is not a standalone connector); we extended the hive connector
interface. So every few releases, the carbondata and presto integration code
needs to be modified. This can be a bigger maintenance problem.

And this is only about read support; when we handle write support, we need to
take care of all the above points as well.

Thanks,
Ajantha

Re: Regarding presto carbondata integration

Posted by akashrn5 <ak...@gmail.com>.
Hi Ajantha,

What you mentioned is a big pain point now. Even when we try to add
write support, the hadoop and hive versions supported
by carbon are different from what presto supports, so we might have
to duplicate code for this case as well. Either we have to
put the carbon code in presto, which might take time, or we may have to put in
extra effort to refactor the presto integration code based on versions,
like we did for the different spark versions.

Regards,
Akash


Re: Regarding presto carbondata integration

Posted by Jacky Li <ja...@qq.com>.

> On Feb 12, 2020, at 1:33 PM, Ajantha Bhat <aj...@gmail.com> wrote:
> 
> Hi all,
> 
> Currently, the master code of carbondata works with *prestodb 0.217*.
> We all know about the competing *presto-sql* as well.
> Some users don't want to migrate to *presto-sql*, as their cloud
> vendor doesn't support it (for example, AWS EMR, Huawei MRS, and Azure
> services except HDInsight still come with *prestodb*).
> 
> So,
> 1. Does carbondata need to support both of them?

Yes, I think some users have already started using prestosql. The PrestoSQL community is also quite active.


> 2. Does carbondata need to maintain two modules, one for prestodb and one for
> prestosql? We may need to extract the common code (big effort).

Yes, I have been thinking the same since trying to adapt to PrestoSQL last December. PrestoSQL has changed the package names of some classes, but most of our code should be common to PrestoSQL and PrestoDB.


> 3. At any time, carbondata can support only one version of prestodb and
> presto-sql. They release a new version every 15 days, and our integration is not
> based on the SPI (it is not a standalone connector); we extended the hive connector
> interface. So every few releases, the carbondata and presto integration code
> needs to be modified. This can be a bigger maintenance problem.

Have you analyzed whether there is a way to use their formal developer API? This is indeed a problem for supporting future versions smoothly.

> 
> And this is only about read support; when we handle write support, we need to
> take care of all the above points as well.
> 
> Thanks,
> Ajantha