Posted to dev@accumulo.apache.org by "Eric Newton (Commented) (JIRA)" <ji...@apache.org> on 2012/04/03 19:28:24 UTC

[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

    [ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245520#comment-13245520 ] 

Eric Newton commented on ACCUMULO-118:
--------------------------------------

If we used viewfs, we would still need a more flexible way of choosing where in the file system to put a tablet's files. Right now we use this layout:

{noformat}
configured root of everything / tables / tableId / generated id / generated id . file extension
{noformat}
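
As a rough illustration of that layout, here is a minimal sketch in Java. The class and method names (TabletFilePath, build) and the sample values in the usage note are hypothetical and are not taken from the Accumulo code base; the sketch only joins the components in the order listed above, to show that the configured root is the single place where a different namespace could be substituted.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not Accumulo's actual code.
final class TabletFilePath {
    private TabletFilePath() {}

    // Joins: root / tables / tableId / generated id / generated id . extension
    static String build(String root, String tableId, String dirId,
                        String fileId, String extension) {
        // Drop trailing slashes so the joined path has no empty components.
        String cleanRoot = root;
        while (cleanRoot.endsWith("/")) {
            cleanRoot = cleanRoot.substring(0, cleanRoot.length() - 1);
        }
        return new StringJoiner("/")
            .add(cleanRoot)
            .add("tables")
            .add(tableId)
            .add(dirId)
            .add(fileId + "." + extension)
            .toString();
    }
}
```

With made-up ids, TabletFilePath.build("/accumulo", "2", "t-0001", "F0000abc", "rf") gives /accumulo/tables/2/t-0001/F0000abc.rf. Passing a fully qualified root such as hdfs://nn1/accumulo would yield a full path name of the kind the issue description suggests storing.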

Some possible implementations:

 * configure a particular table onto a namespace by mounting another namespace at the tableId. But this would be difficult to predict and configure.
 * have a per-table "namespace" property that provides the root directory to use for the table. You could never change it once set, though. Also, it might be nice to have a table spread over multiple namespaces.
 * use a hashing technique to map names into different real namespaces. But if we did this as a layer over other filesystems, we would need to perform a read against all namenodes in order to list the contents of a directory. I'm not sure we do very many directory listings, so maybe this wouldn't hurt much.
 * add a pluggable component that would choose the filenames to use. You could use hashes to distribute files, or choose based on namenode health, decommission status, etc. Unfortunately, this would make the organization of the table's files less coherent.
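
To make the last two options more concrete, here is a minimal sketch of what a pluggable chooser with a hash-based implementation might look like. Everything in it is an assumption for illustration: the interface name VolumeChooser, the method chooseRoot, the class HashingChooser, and the use of CRC32 as the hash are not from the Accumulo code base or from this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical plug-in point: picks the root (namespace) for a new file.
interface VolumeChooser {
    String chooseRoot(String tableId, String fileName);
}

// Hypothetical implementation: spreads files over the roots by hashing.
final class HashingChooser implements VolumeChooser {
    private final List<String> roots;

    HashingChooser(List<String> roots) {
        if (roots.isEmpty()) {
            throw new IllegalArgumentException("at least one root is required");
        }
        this.roots = new ArrayList<>(roots); // defensive copy
    }

    @Override
    public String chooseRoot(String tableId, String fileName) {
        // CRC32 is an arbitrary choice here; any stable hash would do.
        CRC32 crc = new CRC32();
        crc.update((tableId + "/" + fileName).getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is a valid index.
        int index = (int) (crc.getValue() % roots.size());
        return roots.get(index);
    }
}
```

A chooser built over, say, hdfs://nn1/accumulo and hdfs://nn2/accumulo always returns the same root for the same table id and file name, but the mapping changes if the list of roots changes. That is one reason the full path of each file would have to be recorded rather than recomputed. A health-based or decommission-aware chooser could implement the same interface.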


                
> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-118
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-118
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>    Affects Versions: 1.5.0
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira