Posted to hdfs-user@hadoop.apache.org by Rita <rm...@gmail.com> on 2011/03/26 05:06:49 UTC

directory scan issue

Using 0.21

When I have a 1TB filesystem (XFS), the datanode detects it
immediately. When I create 3 identical filesystems, all 3TB are
visible immediately.

But if I create a 6TB filesystem (XFS), add it to dfs.data.dir, and
restart the datanode, "hdfs dfsadmin -report" does not see the new
6TB filesystem.

In all of these cases the datanode does create a 'finalized' file
structure in the respective directories, though.


My questions are:
Is there a limit on the size of dfs.data.dir? What is the largest
filesystem that can be part of it?
Could this be a block scanner issue? Is it possible to make my block
scanning more aggressive?
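
For reference, this is how the volume is wired in (the paths below are
made-up examples, not my real layout):

  <!-- hdfs-site.xml -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data/1/hdfs,/data/2/hdfs,/data/3/hdfs,/data/new6tb/hdfs</value>
  </property>

  # after restarting the datanode, check the reported capacity:
  $ hdfs dfsadmin -report | grep "Configured Capacity"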






-- 
--- Get your facts first, then you can distort them as you please.--

Re: directory scan issue

Posted by Rita <rm...@gmail.com>.
Thank you. Switching to ext4.

On Tue, Mar 29, 2011 at 8:23 AM, Eric <er...@gmail.com> wrote:

> Rita, another issue I've seen is that when you have lots of heavily
> used XFS filesystems, the Linux kernel will at some point crash. The
> XFS driver seems to have problems that only appear with large volumes
> of data. I will switch to ext4 soon because of this.


-- 
--- Get your facts first, then you can distort them as you please.--

Re: directory scan issue

Posted by Eric <er...@gmail.com>.
Rita, another issue I've seen is that when you have lots of heavily
used XFS filesystems, the Linux kernel will at some point crash. The
XFS driver seems to have problems that only appear with large volumes
of data. I will switch to ext4 soon because of this.

2011/3/29 Rita <rm...@gmail.com>

> Thanks. With ext4 I created two 16TB volumes and they are seen. I
> think it may be an issue with XFS.

Re: directory scan issue

Posted by Rita <rm...@gmail.com>.
Thanks. With ext4 I created two 16TB volumes and they are seen. I
think it may be an issue with XFS.
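
For the record, roughly what I did (device names and mount points are
examples):

  $ mkfs.ext4 /dev/sdb1        # ext4 tops out at 16TB with current e2fsprogs
  $ mkdir -p /data/ext4-1
  $ mount /dev/sdb1 /data/ext4-1
  # then add /data/ext4-1/hdfs to dfs.data.dir and restart the datanode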



On Mon, Mar 28, 2011 at 3:50 PM, Todd Lipcon <to...@cloudera.com> wrote:

> I've heard that there is a 4TB limit, but I've never tried to
> replicate it. Given that single disks aren't this large, it suggests
> you might be running RAID or a spanned volume rather than the
> recommended JBOD.



-- 
--- Get your facts first, then you can distort them as you please.--

Re: directory scan issue

Posted by Todd Lipcon <to...@cloudera.com>.
On Fri, Mar 25, 2011 at 9:06 PM, Rita <rm...@gmail.com> wrote:

> My questions are:
> Is there a limit on the size of dfs.data.dir? What is the largest
> filesystem that can be part of it?
>

I've heard that there is a 4TB limit, but I've never tried to
replicate it. Given that single disks aren't this large, it suggests
you might be running RAID or a spanned volume rather than the
recommended JBOD.
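
If it is RAID or a spanned volume, the usual fix is one filesystem per
physical disk, with every mount point listed in dfs.data.dir, e.g.
(device names and paths below are made up):

  # /etc/fstab: one filesystem per disk
  /dev/sdb1  /data/1  ext4  defaults,noatime  0 0
  /dev/sdc1  /data/2  ext4  defaults,noatime  0 0

  <!-- hdfs-site.xml -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data/1/hdfs,/data/2/hdfs</value>
  </property>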


> Could this be a block scanner issue? Is it possible to make my block
> scanning more aggressive?
>

Most likely this is unrelated to block scanning.
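
That said, if you do want the scanner to run more often, the knob (if
I remember correctly) is dfs.datanode.scan.period.hours in
hdfs-site.xml:

  <property>
    <name>dfs.datanode.scan.period.hours</name>
    <!-- default is 504 hours (three weeks); smaller = more aggressive -->
    <value>168</value>
  </property>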

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera