Posted to hdfs-dev@hadoop.apache.org by Kevin Lyda <ke...@ie.suberic.net> on 2013/04/03 15:06:08 UTC

On disk layout of HDFS...

I've been following https://issues.apache.org/jira/browse/HDFS-1312
and really need the balancing tool described therein. I'd be
interested in writing it, but am not sure where to start. I'm more
comfortable in Python, but I suspect the tool has a better chance of
being integrated if I write it in Java.

Is hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop the
place to look for interfaces to manipulate the filesystem?

Kevin

--
Kevin Lyda
Galway, Ireland
US Citizen overseas? We can vote.
Register now: http://www.votefromabroad.org/

Re: On disk layout of HDFS...

Posted by Kevin Lyda <ke...@ie.suberic.net>.
HDFS-1804 looks promising for the longer term. But yes, I suspect a
rebalancer would be useful. I'll work on an offline one first.

Kevin

On Thu, Apr 4, 2013 at 10:45 AM, Harsh J <ha...@cloudera.com> wrote:
> If you're looking for the online solution, Aaron's just posted a
> working implementation of it at
> https://issues.apache.org/jira/browse/HDFS-1804.
>
> For the offline or asynchronous disk balancer discussed in
> https://issues.apache.org/jira/browse/HDFS-1312, if you want your tool
> to be part of the upstream project, I'd encourage first posting your
> design for vetting/comments, followed by the implementation, so that
> all the finer points get covered. The offline tool is the easiest to
> write, and can also exist in Python (outside of HDFS, hosted in some
> GitHub repo perhaps) as it doesn't really have to work with the DN or
> NN's protocol calls. Understanding the block data directory structure
> (ls -l one of the directories listed in your dfs.data.dir or
> dfs.datanode.data.dir setting and follow it down) should let you write
> one up easily.
>
> --
> Harsh J




Re: On disk layout of HDFS...

Posted by Harsh J <ha...@cloudera.com>.
If you're looking for the online solution, Aaron's just posted a
working implementation of it at
https://issues.apache.org/jira/browse/HDFS-1804.

For the offline or asynchronous disk balancer discussed in
https://issues.apache.org/jira/browse/HDFS-1312, if you want your tool
to be part of the upstream project, I'd encourage first posting your
design for vetting/comments, followed by the implementation, so that
all the finer points get covered. The offline tool is the easiest to
write, and can also exist in Python (outside of HDFS, hosted in some
GitHub repo perhaps) as it doesn't really have to work with the DN or
NN's protocol calls. Understanding the block data directory structure
(ls -l one of the directories listed in your dfs.data.dir or
dfs.datanode.data.dir setting and follow it down) should let you write
one up easily.
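
As a rough illustration of the directory walk suggested above, here is a
minimal Python sketch that totals the block files under each DataNode data
directory, which is the first thing an offline balancer would need to know.
It is a sketch under stated assumptions, not a tested tool: it assumes block
data files are named blk_<id> (with a blk_<id>_<genstamp>.meta file next to
each), and the names scan_volume, report and VolumeUsage are made up for this
example. It only reads the disks; it does not move anything.

```python
import os
import re
from collections import namedtuple

# Assumption: block data files are named "blk_<id>" (ids may be negative),
# and their checksum files are "blk_<id>_<genstamp>.meta". Only the data
# files are counted here.
BLOCK_RE = re.compile(r"^blk_-?\d+$")

# Hypothetical record type for this sketch.
VolumeUsage = namedtuple("VolumeUsage", "path blocks bytes")


def scan_volume(data_dir):
    """Walk one data directory and total the block files found under it."""
    blocks = 0
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if BLOCK_RE.match(name):
                blocks += 1
                total += os.path.getsize(os.path.join(root, name))
    return VolumeUsage(data_dir, blocks, total)


def report(data_dirs):
    """Return per-directory usage, fullest directory first."""
    usages = [scan_volume(d) for d in data_dirs]
    return sorted(usages, key=lambda u: u.bytes, reverse=True)
```

Calling report() with the list of directories from your DataNode
configuration gives the most and least loaded volumes at the two ends of the
result. Any real balancer built on this would have to run with the DataNode
stopped and keep each block file together with its .meta file.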




-- 
Harsh J