Posted to issues@kudu.apache.org by "Dan Burkert (JIRA)" <ji...@apache.org> on 2016/04/07 18:54:25 UTC

[jira] [Commented] (KUDU-1401) Expose partition location information through Python API

    [ https://issues.apache.org/jira/browse/KUDU-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230555#comment-15230555 ] 

Dan Burkert commented on KUDU-1401:
-----------------------------------

Hi [~mrocklin].  I'm working on a new API right now that may meet your needs.  There is an associated [design doc|https://github.com/danburkert/kudu/blob/scan-api-design/docs/design-docs/scan-tokens.md] covering the motivation and a high-level overview.  A Java implementation is [already in review|http://gerrit.cloudera.org:8080/#/c/2592/].  The idea is to let clients create 'scan tokens' in much the same way as a normal scan is created.  Each token corresponds to a contiguous physical portion of the table and can later be turned into a scanner over that portion.  The tokens contain locality information and can be easily serialized and deserialized.  Hopefully this will suit your use case?  Once the C++ implementation is done (I'm working on it now), Python bindings can be created.
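
To make the scan token idea concrete, here is a rough Python sketch of how a data-local scheduler might consume such tokens. This is not the Kudu client API: the ScanToken class, its fields, the JSON wire format, and the assign_tokens helper are all hypothetical names made up for illustration, and the real token format may differ substantially.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ScanToken:
    """Hypothetical stand-in for a scan token; NOT the Kudu client API."""
    table: str
    start_key: str                  # assumed inclusive lower bound of the range
    end_key: str                    # assumed exclusive upper bound of the range
    replica_hosts: Tuple[str, ...]  # hosts assumed to hold this range locally

    def serialize(self) -> str:
        # JSON is an assumption here; any compact wire format would do.
        return json.dumps(asdict(self))

    @staticmethod
    def deserialize(payload: str) -> "ScanToken":
        fields = json.loads(payload)
        # JSON has no tuple type, so restore the tuple after decoding.
        fields["replica_hosts"] = tuple(fields["replica_hosts"])
        return ScanToken(**fields)


def assign_tokens(tokens: List[ScanToken],
                  workers: List[str]) -> Dict[str, List[ScanToken]]:
    """Give each token to the least-loaded worker that holds its data locally,
    falling back to the least-loaded worker overall when none is local."""
    assignment: Dict[str, List[ScanToken]] = {w: [] for w in workers}
    for token in tokens:
        local = [w for w in workers if w in token.replica_hosts]
        candidates = local or workers
        target = min(candidates, key=lambda w: len(assignment[w]))
        assignment[target].append(token)
    return assignment


if __name__ == "__main__":
    tokens = [
        ScanToken("metrics", "a", "m", ("host1", "host2")),
        ScanToken("metrics", "m", "z", ("host2", "host3")),
    ]
    # Tokens survive a round trip, so they can be shipped to remote workers.
    shipped = [ScanToken.deserialize(t.serialize()) for t in tokens]
    for host, assigned in assign_tokens(shipped, ["host1", "host2"]).items():
        print(host, [(t.start_key, t.end_key) for t in assigned])
```

The point of the sketch is that the scheduler only needs the locality information and a serializable handle; turning a token back into a scanner would happen on whichever worker receives it.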

> Expose partition location information through Python API
> --------------------------------------------------------
>
>                 Key: KUDU-1401
>                 URL: https://issues.apache.org/jira/browse/KUDU-1401
>             Project: Kudu
>          Issue Type: New Feature
>          Components: api, python
>            Reporter: Matthew Rocklin
>            Priority: Minor
>
> When building data-local parallel applications, it is often useful to know the physical location of blocks of data on the network so that, for each block, we can try to load it into memory on a machine where it already exists on local disk.
> I think this API exists in the C++ layer but is not yet exposed through the Python API.


