Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2019/10/24 10:27:14 UTC

[GitHub] [carbondata] ajantha-bhat commented on issue #3311: [WIP] arrow vector push down

ajantha-bhat commented on issue #3311: [WIP] arrow vector push down
URL: https://github.com/apache/carbondata/pull/3311#issuecomment-545854296

@jackylk: yes, Arrow is already supported in the carbondata SDK, so carbondata can integrate with other languages like Python. **This PR is for performance improvement.**

The current Arrow integration with carbon supports both complex and primitive types, and the conversion from CarbonInternalRow to Arrow vector happens at the top layer. If the Arrow vector is filled while rows are read from the blocklet, one conversion of CarbonInternalRow can be avoided, which will improve performance a bit and make it a proper integration.
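To illustrate the difference between the two paths, here is a rough sketch. It does not use the real carbondata or Arrow classes: `IntColumn`, `viaRows` and `direct` are hypothetical names made up for this example, and the blocklet is assumed to be a plain `int[][]` of column data.

```java
import java.util.ArrayList;
import java.util.List;

public class VectorFillSketch {
    // Hypothetical stand-in for a column vector (NOT the Arrow or carbondata API).
    static final class IntColumn {
        final int[] values;
        int size;
        IntColumn(int capacity) { values = new int[capacity]; }
        void append(int v) { values[size++] = v; }
    }

    // Path 1: build intermediate rows first, then convert them to columns at the top layer.
    static IntColumn[] viaRows(int[][] blockletColumns, int rowCount) {
        List<Object[]> rows = new ArrayList<>();
        for (int r = 0; r < rowCount; r++) {
            Object[] row = new Object[blockletColumns.length];
            for (int c = 0; c < blockletColumns.length; c++) {
                row[c] = blockletColumns[c][r]; // boxed copy into the row
            }
            rows.add(row);
        }
        IntColumn[] out = newColumns(blockletColumns.length, rowCount);
        for (Object[] row : rows) {
            for (int c = 0; c < row.length; c++) {
                out[c].append((Integer) row[c]); // second copy, row to vector
            }
        }
        return out;
    }

    // Path 2: fill the vectors while reading the blocklet, with no intermediate rows.
    static IntColumn[] direct(int[][] blockletColumns, int rowCount) {
        IntColumn[] out = newColumns(blockletColumns.length, rowCount);
        for (int c = 0; c < blockletColumns.length; c++) {
            for (int r = 0; r < rowCount; r++) {
                out[c].append(blockletColumns[c][r]);
            }
        }
        return out;
    }

    private static IntColumn[] newColumns(int columnCount, int capacity) {
        IntColumn[] cols = new IntColumn[columnCount];
        for (int c = 0; c < columnCount; c++) {
            cols[c] = new IntColumn(capacity);
        }
        return cols;
    }

    public static void main(String[] args) {
        int[][] data = {{1, 2, 3}, {10, 20, 30}};
        IntColumn[] a = viaRows(data, 3);
        IntColumn[] b = direct(data, 3);
        System.out.println(java.util.Arrays.equals(a[1].values, b[1].values));
    }
}
```

Both paths end with the same vectors; the second one just skips the row materialization step, which is the conversion this PR tries to avoid.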

However, the current Spark vector doesn't support complex types, so if Arrow vectors are pushed down, Arrow will also stop supporting complex types. We would lose functionality for a small performance improvement.

So we first need to support complex type filling in columnarBatch (vector); then this PR can go in.
Since carbon 2.0 has a requirement for reading complex columns from Presto, that requirement will handle this problem. Once it is done, my PR can be merged.
