Posted to user@accumulo.apache.org by Denis <de...@camfex.cz> on 2013/05/25 19:44:10 UTC

TableOperations.getSplits and location information

Hi

Is there any reason why TableOperations.getSplits() does not expose
the location information (the tablet-to-tserver mapping)?
It has this information internally and then just drops it.

This information would be useful for scheduling scans a bit smarter
(maximizing parallelism while avoiding overloading a single tserver),
and it could also be supplied to a Spark RDD.
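
A rough sketch of the kind of scheduling this would allow, in plain
Java with no Accumulo calls. The tablet-to-tserver map is assumed to
come from whatever API would expose locations; the class name, method
names, row names and server addresses are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TabletScanPlanner {

    // Groups tablets (identified here by their end row) by the tserver
    // that hosts them, preserving the order of the input map.
    public static Map<String, List<String>> groupByServer(Map<String, String> tabletToServer) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : tabletToServer.entrySet()) {
            byServer.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return byServer;
    }

    // Takes one tablet from each server in turn, so that consecutive
    // scans in the resulting order go to different tservers.
    public static List<String> interleave(Map<String, List<String>> byServer) {
        List<Iterator<String>> queues = new ArrayList<>();
        for (List<String> tablets : byServer.values()) {
            queues.add(tablets.iterator());
        }
        List<String> order = new ArrayList<>();
        boolean added = true;
        while (added) {
            added = false;
            for (Iterator<String> q : queues) {
                if (q.hasNext()) {
                    order.add(q.next());
                    added = true;
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical location data: tablet end row -> hosting tserver.
        Map<String, String> locations = new LinkedHashMap<>();
        locations.put("row_f", "tserver1:9997");
        locations.put("row_m", "tserver1:9997");
        locations.put("row_t", "tserver2:9997");
        locations.put("row_z", "tserver2:9997");
        // prints [row_f, row_t, row_m, row_z]
        System.out.println(interleave(groupByServer(locations)));
    }
}
```

With only the split points and no locations, a client cannot build
this grouping and has to guess which scans will land on the same
tserver.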

Yes, I can call MetadataTable.getEntries() instead of
TableOperations.getSplits(), but first, that API looks more
'internal', and second, it needs a copy of the credentials, which
complicates credential management.

Re: TableOperations.getSplits and location information

Posted by Christopher <ct...@apache.org>.

Mainly because that was not a requirement when the API was added.
If another API for querying locality would be useful, feel free to
create a new feature ticket in JIRA arguing for it, or submit a patch:
https://issues.apache.org/jira/browse/ACCUMULO

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

