Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2007/03/29 00:14:25 UTC
[jira] Created: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
----------------------------------------------------------------------------------------
Key: HADOOP-1180
URL: https://issues.apache.org/jira/browse/HADOOP-1180
Project: Hadoop
Issue Type: Bug
Components: dfs
Reporter: dhruba borthakur
Assigned To: dhruba borthakur
The NNbench test should have the option of testing a file system with checksums turned on and with checksums turned off. The original behaviour of the nnbench test was to test HDFS without checksums.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485030 ]
Hadoop QA commented on HADOOP-1180:
-----------------------------------
+1, because http://issues.apache.org/jira/secure/attachment/12354470/nnbench.patch applied and successfully tested against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/523072. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch
[jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-1180:
-------------------------------------
Status: Patch Available (was: Open)
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Doug Cutting <cu...@apache.org>.
Nigel Daley wrote:
> As you realized below, the test was using raw methods before
> HADOOP-928. I don't understand your reference to "undocumented" and
> "unsupported", but I'm not sure it matters.
The 'raw' methods were only intended to be used by FileSystem
implementations.
> One of the design goals of the test is to remove the effects of
> DataNodes as much as possible since this is a NameNode benchmark.
> That's why we used the raw methods (therefore no crc's). We run it with
> 1 byte files with 1 byte blocks with a replication factor of 1, all
> designed to maximize the load on the NameNode and minimize the effects
> of the DataNodes.
In this case, the current checksum implementation should simply double
the number of both namenode and datanode calls over raw calls. So it
should still be a fine benchmark, you just need to double the rates to
make them comparable. The new checksum implementation should be nearly
as fast as raw calls were.
Doug
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Nigel Daley <nd...@yahoo-inc.com>.
On Mar 29, 2007, at 12:07 PM, Doug Cutting wrote:
> Nigel Daley wrote:
>> So shouldn't fixing this test to conform to the new model in
>> HADOOP-1134 be the concern of the patch for HADOOP-1134?
>
> Yes, but, as it stands, this patch would silently stop working
> correctly once HADOOP-1134 is committed. It should instead be
> written in a more robust way, that can survive expected changes.
> Relying on HDFS using ChecksumFileSystem isn't as reliable as an
> explicit constructor that says "I want an unchecksummed FileSystem."
Ya, that's fine. I have no problem changing the way the patch is
implemented.
>> As it stands, I can't run NNBench at scale without using a raw file
>> system, which is what this patch is intended to allow.
>
> It seems strange to disable things in an undocumented and
> unsupported way in order to get a benchmark to complete. How does
> that prove scalability? Rather, leaving NNBench alone seems like a
> strong argument for implementing HADOOP-1134 sooner.
As you realized below, the test was using raw methods before
HADOOP-928. I don't understand your reference to "undocumented" and
"unsupported", but I'm not sure it matters.
> Still, if you want to be able to disable checksums, for benchmarks
> or whatever, we can permit that, but should do so explicitly.
>
>> HADOOP-928 caused this test to use a ChecksumFileSystem and
>> subsequently we saw our "read" TPS metric plummet from 20,000 to a
>> couple hundred.
>
> Ah, NNBench used the 'raw' methods before, which was kind of sneaky
> on its part, since it didn't benchmark the typical user experience.
> Although the namenode performance should only halve at worst with
> checksums as currently implemented, no?
One of the design goals of the test is to remove the effects of
DataNodes as much as possible since this is a NameNode benchmark.
That's why we used the raw methods (therefore no crc's). We run it
with 1 byte files with 1 byte blocks with a replication factor of 1,
all designed to maximize the load on the NameNode and minimize the
effects of the DataNodes.
>> Let's get our current benchmark back on track before we commit
>> HADOOP-1134 (which will likely take a while before it is "Patch
>> Available").
>
> I'd argue that we should fix the benchmark to accurately reflect
> what users see, so that we see real improvement when HADOOP-1134 is
> committed. That would make it a more useful and realistic
> benchmark. However if you believe that a checksum-free benchmark is
> still useful, I think it should be more future-proof.
I think this is the crux of the misunderstanding. This is a NameNode
benchmark, not a DataNode benchmark nor a system benchmark. It
attempts to measure the TPS that are possible in the extreme.
I think you want a different kind of benchmark, which is fair. It's
just not this benchmark.
Cheers,
Nige
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Doug Cutting <cu...@apache.org>.
Nigel Daley wrote:
> So shouldn't fixing this test to conform to the new model in HADOOP-1134
> be the concern of the patch for HADOOP-1134?
Yes, but, as it stands, this patch would silently stop working correctly
once HADOOP-1134 is committed. It should instead be written in a more
robust way, that can survive expected changes. Relying on HDFS using
ChecksumFileSystem isn't as reliable as an explicit constructor that
says "I want an unchecksummed FileSystem."
> As it stands, I can't run
> NNBench at scale without using a raw file system, which is what this
> patch is intended to allow.
It seems strange to disable things in an undocumented and unsupported
way in order to get a benchmark to complete. How does that prove
scalability? Rather, leaving NNBench alone seems like a strong argument
for implementing HADOOP-1134 sooner.
Still, if you want to be able to disable checksums, for benchmarks or
whatever, we can permit that, but should do so explicitly.
> HADOOP-928 caused this test to use a
> ChecksumFileSystem and subsequently we saw our "read" TPS metric plummet
> from 20,000 to a couple hundred.
Ah, NNBench used the 'raw' methods before, which was kind of sneaky on
its part, since it didn't benchmark the typical user experience.
Although the namenode performance should only halve at worst with
checksums as currently implemented, no?
> Let's get our current benchmark back on track before we commit
> HADOOP-1134 (which will likely take a while before it is "Patch
> Available").
I'd argue that we should fix the benchmark to accurately reflect what
users see, so that we see real improvement when HADOOP-1134 is
committed. That would make it a more useful and realistic benchmark.
However if you believe that a checksum-free benchmark is still useful, I
think it should be more future-proof.
Doug
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Doug Cutting wrote:
> hairong Kuang wrote:
>> 1. NNBench sets the block size to 1. Although it generates a file with
>> only 1 byte, the file's checksum file has 16 bytes (a 12-byte header
>> plus a 4-byte checksum). Without the checksum file, only 1 block needs
>> to be generated. With the checksum file, 17 blocks need to be generated.
>> So the overhead of generating a checksum file is huge in this special case.
>
> So to make this benchmark more representative of real performance with
> lots of small files, we should change the block size to 16 or greater.
> While small files may be typical, a blocksize of 1 is not. Since the
> benchmark only writes one byte per file, a tiny block size doesn't
> really need to be set at all for this benchmark.
+1. I was thinking the same. We could even leave the block size at the
default and create 1-byte files as we do now.
Raghu.
> Doug
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Doug Cutting <cu...@apache.org>.
hairong Kuang wrote:
> 1. NNBench sets the block size to 1. Although it generates a file with
> only 1 byte, the file's checksum file has 16 bytes (a 12-byte header plus
> a 4-byte checksum). Without the checksum file, only 1 block needs to be
> generated. With the checksum file, 17 blocks need to be generated. So the
> overhead of generating a checksum file is huge in this special case.
So to make this benchmark more representative of real performance with
lots of small files, we should change the block size to 16 or greater.
While small files may be typical, a blocksize of 1 is not. Since the
benchmark only writes one byte per file, a tiny block size doesn't
really need to be set at all for this benchmark.
Doug
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
hairong Kuang wrote:
> Two main reasons caused the performance decrease:
>
> 1. NNBench sets the block size to 1. Although it generates a file with
> only 1 byte, the file's checksum file has 16 bytes (a 12-byte header plus
> a 4-byte checksum). Without the checksum file, only 1 block needs to be
> generated. With the checksum file, 17 blocks need to be generated. So the
> overhead of generating a checksum file is huge in this special case.
> HADOOP-1134 should help a lot for this.
Thanks. The numbers now make sense.
Raghu.
RE: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by hairong Kuang <ha...@yahoo-inc.com>.
Two main reasons caused the performance decrease:
1. NNBench sets the block size to 1. Although it generates a file with
only 1 byte, the file's checksum file has 16 bytes (a 12-byte header
plus a 4-byte checksum). Without the checksum file, only 1 block needs to be
generated. With the checksum file, 17 blocks need to be generated. So the
overhead of generating a checksum file is huge in this special case.
HADOOP-1134 should help a lot here.
2. NotReplicatedYetException occurs only when a file has more than 1 block.
Because the checksum file has 16 blocks, it receives
NotReplicatedYetException. The client's retries slow down file writing
significantly. HADOOP-1093 should be able to fix this.
Hairong
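Hairong's block arithmetic can be sketched as follows. The 12-byte header and 4-byte checksum sizes are taken from her message; the helper names and constants are illustrative, not Hadoop code:

```python
import math

HEADER_BYTES = 12   # checksum file header size, per Hairong's numbers
CRC_BYTES = 4       # one CRC value covering the single 1-byte chunk

def blocks_for(file_bytes, block_size):
    """Number of blocks needed to store a file of file_bytes."""
    return max(1, math.ceil(file_bytes / block_size))

def total_blocks(data_bytes, block_size):
    """Blocks for a data file plus its companion checksum file."""
    crc_file_bytes = HEADER_BYTES + CRC_BYTES  # 16 bytes for a 1-byte file
    return blocks_for(data_bytes, block_size) + blocks_for(crc_file_bytes, block_size)

# NNBench's pathological setting: a 1-byte file with 1-byte blocks
print(total_blocks(1, 1))    # 1 data block + 16 checksum blocks = 17

# Doug's suggestion: a block size of 16 or greater
print(total_blocks(1, 16))   # 1 data block + 1 checksum block = 2
```

With a block size of 16 or more the checksum file fits in a single block, which also keeps every file at one block and so avoids the NotReplicatedYetException retries described in point 2.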
-----Original Message-----
From: Raghu Angadi [mailto:rangadi@yahoo-inc.com]
Sent: Thursday, March 29, 2007 2:49 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Nigel Daley wrote:
> So shouldn't fixing this test to conform to the new model in
> HADOOP-1134 be the concern of the patch for HADOOP-1134? As it stands,
> I can't run NNBench at scale without using a raw file system, which is
> what this patch is intended to allow. HADOOP-928 caused this test to
> use a ChecksumFileSystem and subsequently we saw our "read" TPS metric
> plummet from 20,000 to a couple hundred.
Wow! This would be a good test for HADOOP-1134. I didn't expect the TPS to be so
different. I would expect TPS to remain closer to 20,000 with CRCs once
HADOOP-1134 is in.
Raghu.
> Let's get our current benchmark back on track before we commit
> HADOOP-1134 (which will likely take a while before it is "Patch
> Available").
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Nigel Daley wrote:
> So shouldn't fixing this test to conform to the new model in HADOOP-1134
> be the concern of the patch for HADOOP-1134? As it stands, I can't run
> NNBench at scale without using a raw file system, which is what this
> patch is intended to allow. HADOOP-928 caused this test to use a
> ChecksumFileSystem and subsequently we saw our "read" TPS metric plummet
> from 20,000 to a couple hundred.
Wow! This would be a good test for HADOOP-1134. I didn't expect the TPS to be
so different. I would expect TPS to remain closer to 20,000 with CRCs once
HADOOP-1134 is in.
Raghu.
> Let's get our current benchmark back on track before we commit
> HADOOP-1134 (which will likely take a while before it is "Patch
> Available").
Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by Nigel Daley <nd...@yahoo-inc.com>.
So shouldn't fixing this test to conform to the new model in
HADOOP-1134 be the concern of the patch for HADOOP-1134? As it
stands, I can't run NNBench at scale without using a raw file system,
which is what this patch is intended to allow. HADOOP-928 caused
this test to use a ChecksumFileSystem and subsequently we saw our
"read" TPS metric plummet from 20,000 to a couple hundred.
Let's get our current benchmark back on track before we commit
HADOOP-1134 (which will likely take a while before it is "Patch
Available").
[jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-1180:
---------------------------------
Status: Open (was: Patch Available)
-1 This patch may be rendered obsolete by HADOOP-1134. And, the way it is written, the 'useChecksum=false' mode will silently fail to work once HADOOP-1134 is completed. So, if we feel we'll want to continue to support this feature after HADOOP-1134, then we should add an explicit way of constructing an HDFS FileSystem that does not perform checksumming, rather than relying on 'instanceof ChecksumFileSystem'.
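Doug's design point, that an explicit constructor is more robust than relying on `instanceof ChecksumFileSystem`, can be illustrated with a small sketch. None of these class or function names are Hadoop's actual API; this is a hypothetical Python rendering of the two approaches:

```python
class FileSystem:
    """Minimal stand-in for a filesystem interface."""
    def open(self, path):
        raise NotImplementedError

class RawFS(FileSystem):
    """Reads bytes directly, no checksum verification."""
    def open(self, path):
        return f"raw:{path}"

class ChecksumFS(FileSystem):
    """Wraps a raw filesystem and verifies checksums on read."""
    def __init__(self, raw):
        self.raw = raw
    def open(self, path):
        return f"checked:{path}"

# Fragile: depends on HDFS happening to be wrapped in a ChecksumFS.
# If checksumming later moves inside the raw filesystem (as in HADOOP-1134),
# the isinstance check no longer matches and this silently returns a
# still-checksummed filesystem.
def get_raw_by_unwrapping(fs):
    return fs.raw if isinstance(fs, ChecksumFS) else fs

# Robust: the caller states its intent explicitly ("I want an unchecksummed
# FileSystem"), and the factory can honor that request no matter where
# checksumming is implemented.
def get_filesystem(verify_checksums=True):
    raw = RawFS()
    return ChecksumFS(raw) if verify_checksums else raw
```

The benefit of the second form is that the benchmark's `useChecksum=false` mode keeps working across internal refactorings, because the contract lives in the constructor rather than in the class hierarchy.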
[jira] Resolved: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur resolved HADOOP-1180.
--------------------------------------
Resolution: Won't Fix
We will continue to run NNbench with the checksum file system.
[jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-1180:
-------------------------------------
Attachment: nnbench.patch
The default behaviour of the nnbench benchmark is to measure a file system without checksums. If the "-useChecksum" option is specified, it measures performance with checksums switched on.
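The option described above might be invoked like this. The real nnbench is a Java MapReduce program, so this is only a hypothetical Python sketch of the flag's default-off behaviour:

```python
import argparse

# Illustrative mirror of the patch's command-line option: checksums are
# off by default and enabled only when -useChecksum is passed.
parser = argparse.ArgumentParser(prog="nnbench")
parser.add_argument("-useChecksum", action="store_true",
                    help="measure the filesystem with checksums switched on")

default_run = parser.parse_args([])
checksum_run = parser.parse_args(["-useChecksum"])

print(default_run.useChecksum)   # False: raw (unchecksummed) measurement
print(checksum_run.useChecksum)  # True: checksums switched on
```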
[jira] Commented: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485035 ]
Nigel Daley commented on HADOOP-1180:
-------------------------------------
+1 code review
This is just a test fix.