You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/07/16 16:27:32 UTC

[GitHub] [incubator-pinot] mcvsubbu commented on pull request #7167: Bump up data table version to V4 to optimize space usage.

mcvsubbu commented on pull request #7167:
URL: https://github.com/apache/incubator-pinot/pull/7167#issuecomment-881570533


   > > > since presto connector will access pinot server with segment query and wrap up DataTable by itself. Is it possible to make the query itself contain the version of which DataTable protocol to use?
   > > > So the Pinot server can work with v3 broker/v4 broker or v2 presto.
   > > 
   > > 
   > > @xiangfu0 @mayankshriv
   > > We will eventually stop supporting V2 and V3, so it's better to update codebase of presto connector to follow up the most recent pinot. But I do think it a good idea to to allow client specify DataTable version, this will make client happy: client only need make minimal change after pinot is upgraded( by specifying an old DataTable version), this will give client developer more time to upgrade their code. Let's pull @mcvsubbu @siddharthteotia to chime in.
   > 
   > @mqliang We do not own the Presto code. On top of that, there are two of them PrestoDB and Trino. And is being used in production at Uber and City Storage Systems. Also, we are still dealing with the v3 change that was recently made.
   > 
   > FWIW, I think either DataTable should not have been exposed outside of Pinot, or it should have been treated as an external interface that should not be changed in backward incompatible way. But given that we are in the situation, we should probably acknowledge that and avoid making further backward incompatible changes until we resolve the situation across the communities.
   > 
   > Thoughts?
   
   Totally agree that we should not have exposed DataTable to begin with.  I believe we do need this optimization, so adding something in the protocol (or, maybe adding a capability handshake in the beginning?) is workable.  Note that adding this in the beginning will mean adding a config to the broker to go ahead and use v2,v3,v4 in its requests. Since we upgrade brokers first, this will also mean  for the server to look (for each request) which version it needs to return with, and exception if it does not support that version (for a robust protocol design). It is also possible for the server to indicate its support in zookeeper under its instance config, and the broker can cache it on a per server basis along with the routing table. Either handshake or zk seems to be a workable solution, probably better than adding something in every query path.
   
   Can we also consider adding an spi interface for datatable, and have the trino/presto community make a one-time move to use that instead? Over time, we can support the concept of a session, and retain such things at session level rather than every query.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org